Over the past week, our Yelp API experienced slow response times. A search request that typically takes 3.x seconds (1.x seconds with Ludicrous Speed) was taking 40-50 seconds on average. We hadn't pushed any updates during that time, so we suspected that Yelp had upgraded its detection mechanism. After checking our logs, we noticed an increasing number of 403 Forbidden errors. Some requests still succeeded, but only a very small percentage. It was clear that we needed to resolve the 403 Forbidden errors.

It could have been a complex problem if we had to deal with cookies, randomly generated keys in the header, or other similar factors. However, after some trial and error, we were able to mitigate the problem simply by adding an Accept-Language header.
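As a rough illustration of the fix, the sketch below attaches an Accept-Language header to a request using Python's standard library. The `build_request` helper, the target URL, and the `en-US,en;q=0.9` value are assumptions for illustration only; they are not necessarily the client or header value we run in production.

```python
import urllib.request


def build_request(url: str, language: str = "en-US,en;q=0.9") -> urllib.request.Request:
    """Build a GET request that carries an Accept-Language header.

    The default language value is an assumption, not a recommendation.
    """
    return urllib.request.Request(url, headers={"Accept-Language": language})


if __name__ == "__main__":
    req = build_request("https://www.yelp.com/")
    # Nothing is sent until urllib.request.urlopen(req) is called.
    # urllib stores header names as "Accept-language" internally;
    # HTTP header names are case-insensitive, so this is harmless.
    print(req.header_items())
```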

We swiftly implemented the fix and deployed it to production, and our customers are now experiencing the same fast response times as before.


It’s great that the solution works, but I’m still curious about the reason behind it. The first question to consider is whether websites truly respect the Accept-Language header. If they do, what kind of websites are they? Probably those with global visitors, like Yelp. But does Yelp actually make use of it?

Yelp Home Page

I changed my language settings in Firefox to Japanese (to make any difference obvious) and verified that the request was sending the correct Accept-Language: ja header. However, Yelp still responded with English text.

I observed the same results even when using a VPN set to Japan. The default location was correctly set to a place in Japan, yet the response remained in English. I concluded that Yelp doesn't actually use the header to determine which language to serve.

The only way to change the language is to visit the specific domain that Yelp provides for it. For example, selecting Japanese will redirect you to www.yelp.co.jp.

This is a common practice in many web applications: they either redirect visitors to a language-specific domain or differentiate by using a path like /ja.

Interestingly, Google does respect the Accept-Language header. While I was testing on the Yelp domain, the Google Sign-In widget popup appeared in Japanese. To confirm, I performed a search on Google and it worked as expected.

A simple test request using Postman also confirmed this.
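The same kind of check can be scripted. The sketch below sends a request with a given Accept-Language value and reports the `lang` attribute of the returned `<html>` tag. Using that attribute as a signal is an assumption on my part (the tests above were judged by reading the page text), and the helper names, the User-Agent string, and the URL are hypothetical.

```python
import re
import urllib.request
from typing import Optional

# Matches the lang attribute on the opening <html> tag, e.g. <html lang="ja">.
_HTML_LANG = re.compile(r"<html\b[^>]*?\slang\s*=\s*[\"']?([A-Za-z0-9-]+)", re.IGNORECASE)


def html_lang(html: str) -> Optional[str]:
    """Return the lang attribute of the <html> tag, or None if it is absent."""
    match = _HTML_LANG.search(html)
    return match.group(1) if match else None


def fetch_lang(url: str, accept_language: str) -> Optional[str]:
    """Fetch a page with the given Accept-Language and report its declared language."""
    req = urllib.request.Request(
        url,
        headers={"Accept-Language": accept_language, "User-Agent": "Mozilla/5.0"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    return html_lang(body)


if __name__ == "__main__":
    # Performs a real network request; the result depends on the site.
    print(fetch_lang("https://www.google.com/", "ja"))
```

A site may also localize its text without updating the `lang` attribute, so treat the result as a hint rather than proof.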

Meanwhile, Yelp still responded in English, as we observed earlier.

There are several discussions (on Stack Overflow and in an article) about the Accept-Language header. I think the takeaway is that it is probably a good design decision to take the Accept-Language header into account when designing a multilingual web application.
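To make that concrete, here is a minimal sketch of how a server might choose a language from the header. It parses the comma-separated language ranges and their optional `q` weights, then picks the first supported match. The function names and the `supported` list are hypothetical, and the matching is deliberately simplified (exact tag first, then primary subtag; the `*` wildcard is ignored).

```python
from typing import List, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Return (language-range, q) pairs sorted by descending q."""
    ranges = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # a missing q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        ranges.append((tag.strip().lower(), q))
    # sorted() is stable, so equal weights keep their order in the header.
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def pick_language(header: str, supported: List[str], default: str = "en") -> str:
    """Pick the best supported language for the header, or fall back to default."""
    by_lower = {lang.lower(): lang for lang in supported}
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        if tag in by_lower:
            return by_lower[tag]
        primary = tag.split("-")[0]  # e.g. "en-us" falls back to "en"
        if primary in by_lower:
            return by_lower[primary]
    return default


if __name__ == "__main__":
    print(pick_language("ja,en-US;q=0.7,en;q=0.3", ["en", "ja"]))
```

A production application would more likely rely on its web framework's built-in content negotiation than on hand-rolled parsing like this.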

As far as web scraping is concerned, it is a good practice to include the common headers in every request: User-Agent, Accept, Accept-Encoding, Accept-Language, Connection, Cache-Control.
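One way to apply this is to keep a default header set and merge per-request overrides into it. In the sketch below, every header value is an assumption chosen to resemble a desktop browser, and `build_headers` is a hypothetical helper; the resulting dictionary can be passed to whichever HTTP client you use.

```python
from typing import Dict, Optional

# Illustrative defaults only; the values are assumptions, not recommendations.
DEFAULT_HEADERS: Dict[str, str] = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    # Only advertise encodings your HTTP client can actually decode.
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "en-US,en;q=0.9",
    "Connection": "keep-alive",
    "Cache-Control": "no-cache",
}


def build_headers(overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the default headers with overrides applied."""
    headers = dict(DEFAULT_HEADERS)
    for name, value in (overrides or {}).items():
        # Header names are case-insensitive, so drop any existing variant first.
        for existing in list(headers):
            if existing.lower() == name.lower():
                del headers[existing]
        headers[name] = value
    return headers


if __name__ == "__main__":
    for name, value in build_headers({"Accept-Language": "ja"}).items():
        print(f"{name}: {value}")
```

Some HTTP clients manage headers such as Connection themselves, so a value set here may be overridden when the request is sent.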

Conclusion

Accept-Language should be respected when designing web applications for multiple languages, as it indicates the user's preferred language. Other indicators, such as location, can be considered afterwards. Does that make sense? Feel free to contact me (terry@serpapi.com) if you'd like to share your opinion.

