SEMrush keeps taking down my site

My site is being absolutely bombarded by page requests originating from semrush.com. Its a bot that is indexing and downloading links to my entire site, which would be fine if it were a small website, except that I have content going back to 1996, which is rather a lot, and they are hitting the server with so many requests that Apache2 and/or MySQL Server 5.7 is crashing.

It’s utterly frustrating, and they have taken the site down 3 times in the past couple of hours. Their bot should be rate-limited so that it doesn’t cause issues like this! The site can cope with 100+ simultaneous visitors as it is running on an AWS T3 micro instance, which should be more than capable, but unfortunately, its not capable of dealing with the SEMrush bot.

In the coming hours I will be working on ways to permanently blacklist their services, and I suggest that other website owners do the same so that the same misfortune does not befall their site too.

MySQL Collation Error

Occasionally I come across databases written by other developers who have a complete mix of character sets, and where the database character set does not match the character set of fields within a table. This causes all sorts of nasty errors when trying to use CONCAT or CONCAT_WS. Errors like:

Illegal mix of collations for operation ‘concat_ws’)

The most common cause is that the database type does not match the field or table types.

1. Identify the character set that the fields within the table use with the following sql script:

SELECT
COLUMN_NAME,
CHARACTER_SET_NAME,
COLLATION_NAME
FROM
information_schema.`COLUMNS`
WHERE
table_name = “yourtablenamehere”;

 

2. Identify the character set that the database uses:

SELECT
default_character_set_name
FROM
information_schema.SCHEMATA
WHERE
schema_name = “solution9ssu”;

 

These should be the same, however a very common difference is that some developers create the database using utf-8 and the fields imported are in latin1_swedish_ci. This causes the collation error.

To solve: alter the character encoding of each table. NB. This can cause all manner of issues, so take a full backup first!

ALTER TABLE yourtablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;