Describe DNS/azure/Moredetals here.

https://twitter.com/takekazuomi/status/1124507499501985793

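The RCA for yesterday's Azure outage is out. A change process went wrong, and one of the four name servers ended up with empty zone data and started returning NXDOMAIN. As a result, about 25% of queries for names such as http://database.windows.net failed. 11:54 - May 4, 2019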

watchA/database.windows.net

1. history

https://azure.microsoft.com/en-us/status/history/

More details: This incident resulted from the coincidence of two separate errors.

1) Microsoft engineers executed a name server delegation change to update one name server for several Microsoft zones including Azure Storage and Azure SQL Database. Each of these zones has four name servers for redundancy, and the update was made to only one name server during this maintenance.

2) As an artifact of automation from prior maintenance, empty zone files existed on servers that were not the intended target of the assigned delegation. This by itself was not a problem as these name servers were not serving the zones in question.

Due to the configuration error in change automation in this instance, the name server delegation made during the maintenance targeted a name server that had an empty copy of the zones.

Since only one out of the four name server's records for the zones was incorrect, approximately one in four queries for the impacted zones would have received an incorrect negative response.
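The "one in four" figure can be checked with a minimal sketch. It assumes each query goes to one of the four name servers chosen uniformly at random, which is a simplification: real resolvers often prefer servers with lower latency and may retry. The server labels are hypothetical and do not come from the RCA.

```python
from fractions import Fraction

# Hypothetical name server labels; the RCA does not name the actual servers.
NAME_SERVERS = {
    "ns1": "ok",
    "ns2": "ok",
    "ns3": "ok",
    "ns4": "empty-zone",  # the misdelegated server holding an empty zone copy
}


def failure_fraction(servers):
    """Fraction of queries that get a wrong negative answer, assuming each
    query goes to one server chosen uniformly at random (an assumption)."""
    bad = sum(1 for state in servers.values() if state == "empty-zone")
    return Fraction(bad, len(servers))


print(failure_fraction(NAME_SERVERS))  # 1/4
```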

DNS resolvers may cache negative responses for some period of time (negative caching), so even though erroneous configuration was promptly fixed, customers continued to be impacted by this change for varying lengths of time.
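The effect of negative caching can be illustrated with a toy resolver cache. This is a sketch under stated assumptions, not how any particular resolver is implemented: the class `NegativeCache`, the 300-second TTL, the host name `myserver.database.windows.net`, and the address `192.0.2.10` (a documentation-only address) are all hypothetical. In real DNS the negative TTL is derived from the zone's SOA record.

```python
class NegativeCache:
    """Toy resolver cache that remembers NXDOMAIN answers for a fixed TTL."""

    def __init__(self, negative_ttl):
        self.negative_ttl = negative_ttl  # seconds; hypothetical value
        self._expires = {}  # name -> time at which the cached NXDOMAIN expires

    def lookup(self, name, now, authoritative_answer):
        # Serve the cached negative answer while it is still fresh.
        if name in self._expires and now < self._expires[name]:
            return "NXDOMAIN (cached)"
        # The entry is absent or expired, so ask the authoritative side again.
        self._expires.pop(name, None)
        answer = authoritative_answer(name)
        if answer == "NXDOMAIN":
            self._expires[name] = now + self.negative_ttl
        return answer


def broken(name):
    # Stands in for the name server with the empty zone copy.
    return "NXDOMAIN"


def fixed(name):
    # Stands in for a correctly configured name server.
    return "192.0.2.10"


cache = NegativeCache(negative_ttl=300)
host = "myserver.database.windows.net"  # hypothetical host name
print(cache.lookup(host, 0, broken))    # NXDOMAIN
print(cache.lookup(host, 60, fixed))    # NXDOMAIN (cached)
print(cache.lookup(host, 301, fixed))   # 192.0.2.10
```

The second lookup still fails even though the authoritative side has already been fixed, which matches the RCA's observation that customers stayed impacted for varying lengths of time after the correction.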