Difference between revisions of "MusicBrainz API/Rate Limiting"

From MusicBrainz Wiki
Revision as of 00:17, 20 December 2011

Introduction

MusicBrainz has finite resources and wishes to make the MusicBrainz database available to as much of the Internet community as possible. However, at certain times of day the number of requests to the XML Web Service exceeds our capacity for handling them. If we attempted to honor every request, we would overload all of our servers and degrade the service for everyone. For this reason we rate limit our Web Service, which restricts the number of requests that clients can make in a given period of time.

When a request reaches our servers we check three conditions, in the following order:

  1. Source IP address: If a single IP address makes one request per second, all of its requests are honored. If an IP address exceeds 10 requests in 10 seconds, we immediately block it, and further requests from that address receive a 503 error. Once the request rate for that IP address drops below 10 in the last 10 seconds, requests are honored again.
  2. Global rate limit: If the total number of requests coming in to MusicBrainz within 10 seconds exceeds our global rate limit, all requests are rejected with a 503 error. This continues until the total count of requests in the last 10 seconds drops below the global rate limit. The current limit is 2,500 requests per 10 seconds (equivalent to 250 requests per second).
  3. User-Agent string: Each application making requests to our web service must identify itself using the User-Agent string in the HTTP request header. If your application provides no User-Agent string, we will reject the request with a 403 Forbidden error. We have also blocked very common User-Agent strings from generic HTTP request libraries (see below for details). Some applications that use our own client libraries make too many requests at certain times of day, which has forced us to block them. Soon we are going to throttle some User-Agent strings rather than blocking them.
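
As an illustration of staying within the per-IP rule above, here is a minimal client-side sketch in Python. It is not part of any MusicBrainz library; the class name `SlidingWindowLimiter` and its parameters are hypothetical. The defaults mirror the figures quoted above (10 requests per 10 seconds), and a lower `max_requests` would leave a safer margin.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side limiter: at most max_requests per window seconds."""

    def __init__(self, max_requests=10, window=10.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window = window
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._sent = deque()     # timestamps of requests inside the window

    def wait_time(self):
        """Return how many seconds to wait before the next request is allowed."""
        now = self._clock()
        # Forget requests that have already left the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window - (now - self._sent[0])

    def acquire(self):
        """Block until a request may be sent, then record it."""
        delay = self.wait_time()
        while delay > 0:
            self._sleep(delay)
            delay = self.wait_time()
        self._sent.append(self._clock())
```

An application would call `acquire()` immediately before each web service request. Because the limits are enforced per source IP address, every process sharing that address would need to share one limiter for this to be effective.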

Current blocking rules

As of 2011-12-19 our blocking rules are:

User-Agent String            Version   Action    Allowed Rate
<blank>                      -         blocked   0
Java                         any       blocked   0
Python-urllib                any       blocked   0
Jakarta Commons-HttpClient   any       blocked   0
python-musicbrainz           0.7.3     blocked   0

As of 2011-12-20 we hope to move python-musicbrainz/0.7.3 from blocked to throttled at a maximum of 500 requests per 10 seconds. We will update this page once the new throttling is in place.

Note: We may change the blocking/throttling rules at any time in order to protect the overall site health.

Providing meaningful User-Agent strings

To be a good citizen in the MusicBrainz world, please provide a good User-Agent string. We suggest that your User-Agent string should look like:

Application name/<version> ( contact-url )

for example:

MyAwesomeTagger/1.2.0 ( http://myawesometagger.example.com )
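
A minimal sketch of sending such a string from Python's standard library follows. The helper names `build_user_agent` and `make_request` are hypothetical, and the request URL is a placeholder rather than a real web service endpoint. Setting the header explicitly matters here because the default `Python-urllib` User-Agent is among the blocked strings listed above.

```python
import urllib.request


def build_user_agent(app_name, version, contact_url):
    """Format a User-Agent string as: name/version ( contact-url )."""
    return f"{app_name}/{version} ( {contact_url} )"


def make_request(url, user_agent):
    """Create a urllib Request that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


ua = build_user_agent("MyAwesomeTagger", "1.2.0",
                      "http://myawesometagger.example.com")
# Placeholder URL for illustration only; substitute the real endpoint.
req = make_request("http://musicbrainz.example.org/ws/placeholder", ua)
```

Passing `req` to `urllib.request.urlopen` would send the custom header in place of the library default.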

What can I do if my application is blocked?

If you are an application author and your application has been blocked, you will need to change your application to send us a proper User-Agent string. See the section above on how to do this. If you already send a proper User-Agent string and your application has still been blocked, you should contact us to find out why we blocked you.

How can I be a good citizen and be smart about using the Web Service?

Please refrain from having your application wake up at a fixed time of day to perform some action. For instance, having your application wake up at 03:00 local time and query a lot of data from MusicBrainz is a bad idea. If this application gets distributed to many users around the globe (e.g. in a Linux distribution), then at various times around the clock, but always at the beginning of the hour, MusicBrainz will be overloaded with requests from your application. Also, 03:00 in your timezone might be peak time for MusicBrainz somewhere else in the world. If you program your application this way and we notice it, we will block your application.

If there is a task you would like to perform in the background and you are tempted to do it at an off-peak time, you should instead have your application make calls at random intervals throughout the day. An application that spreads out its calls in this way also spreads its load on the MusicBrainz servers across the day and avoids creating artificial peak times.
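
One way to sketch the random-interval approach in Python is shown below. The function name `random_schedule` and its parameters are hypothetical; the idea is only to draw each call's start time uniformly over the day instead of pinning it to a fixed hour.

```python
import random


def random_schedule(num_calls, period_seconds=24 * 60 * 60, rng=None):
    """Return sorted offsets (seconds from the start of the period), one per call.

    Offsets are drawn uniformly at random so that calls do not cluster
    at a fixed time such as the top of an hour.
    """
    rng = rng or random.Random()
    return sorted(rng.uniform(0, period_seconds) for _ in range(num_calls))
```

A background task could compute its schedule once per day and sleep until each offset before making the corresponding call. Since each installation draws its own offsets, many installed copies would not all hit the servers at the same moment.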

Also, if your application polls MusicBrainz to see whether some metadata has changed, please don't do this. Metadata really doesn't change all that often, so polling for changes will rarely give useful results. We currently do not have a good solution in place to let users know when metadata does change, but it is something we would like to address in the future.