There was someone a few days ago just saying to send data from a single clan for 2 weeks and they could prove/disprove ... Surely that is not randomly sampled?
There are a large number of users reporting data for the full 2 months. Data is linked to an anonymous user profile, so you can follow each user over the 2 months and see statistically whether users stop reporting when they get a sacred...
If the data was biased due to non-random sampling, then I am sure you could show this in the first month of all users' data. The bias would show if you compared the histogram to the expected distribution from the data-mined model.
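As a rough sketch of that histogram comparison, here is a Pearson chi-square statistic in plain Python. The assumptions are mine: each user's sacred count is treated as binomial over a fixed number of chests, and `p0` is only a placeholder for the data-mined drop probability, which I am not quoting here.

```python
import math

def expected_counts(n_users, n_chests, p0):
    """Expected number of users with 0..n_chests sacreds if every chest
    drops a sacred independently with probability p0 (placeholder value)."""
    return [n_users * math.comb(n_chests, k) * p0**k * (1 - p0)**(n_chests - k)
            for k in range(n_chests + 1)]

def pearson_chi_square(observed, expected):
    """Return (statistic, degrees of freedom) for observed vs expected bins."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1
```

A large statistic for the given degrees of freedom means the reported histogram does not look like the model. The approximation is poor when an expected bin count is very small, so rare bins may need to be merged first.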
Remember, this initial experiment was meant to show users that their drop data did conform to the data-mined model, as in their first report. The bimodal distribution came out unexpectedly in the second month, with many of the same users' data, not just an injection of new users' data, as I understand it. So unless you think the users deliberately falsified data, how can some users continue to get normal results when there is a statistically significant change in the distribution for the next month?
Remember, there is a mathematical model, with probabilities coming from data mining, which is generally accepted by the community (although no information was released enabling users to verify this). This gives you the whole-population statistics.
The vast majority of users report chest drops that agree with the predictions of the data-mined model for the first month. Most users report the same for the second month. A smaller (but significant, depending on your test) group reports a changed drop rate distribution in the second month.
You can filter the raw data using your own skills and professional experience. Only take UNM chests, and reject users who give an impossible data value such as 6 sacreds (max 2 from 2 chests). If you think users stopped giving data when they got a sacred drop, then remove users who conform to this pattern.
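The filtering described above could look something like this in Python. It is only a sketch: the record fields (`user`, `day`, `tier`, `chests`, `sacreds`) are hypothetical names, not the real format of the shared data, and the "stopped after a sacred" rule is one possible heuristic of my own, not an agreed definition.

```python
from collections import defaultdict

def filter_reports(reports, drop_stop_after_sacred=False, last_day=None):
    """Keep UNM reports from users with no impossible values.

    reports: list of dicts with hypothetical keys
             user, day, tier, chests, sacreds.
    """
    by_user = defaultdict(list)
    for r in reports:
        by_user[r["user"]].append(r)

    kept = []
    for rows in by_user.values():
        # Reject the whole user on any impossible value,
        # e.g. 6 sacreds from 2 chests.
        if any(r["sacreds"] < 0 or r["sacreds"] > r["chests"] for r in rows):
            continue
        unm = sorted((r for r in rows if r["tier"] == "UNM"),
                     key=lambda r: r["day"])
        if not unm:
            continue
        if drop_stop_after_sacred and last_day is not None:
            # Heuristic (assumption): the user quit before the end of the
            # window and their final report holds their first sacred.
            stopped_early = unm[-1]["day"] < last_day
            first_sacred_is_last = (unm[-1]["sacreds"] > 0 and
                                    all(r["sacreds"] == 0 for r in unm[:-1]))
            if stopped_early and first_sacred_is_last:
                continue
        kept.extend(unm)
    return kept
```

Running the analysis with and without `drop_stop_after_sacred` shows how much the conclusion depends on that suspected reporting bias.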
Any professional or university student would have access to mathematical analysis software (e.g. the MATLAB stats package) which can very quickly perform various hypothesis tests. The two hypotheses to compare would be:
1. That the results were all produced by the data-mined mathematical model for the 1st and 2nd months.
2. That some users' drop probabilities changed in the second month to form the bimodal distribution, which can be represented by the data-mined probabilities and a single second set of probabilities.
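For anyone without a stats package, the two hypotheses can be compared with a likelihood ratio in plain Python. This is a sketch under my own assumptions: each user's second-month data is reduced to a pair (sacreds, chests), drops are binomial, `p0` is a placeholder for the data-mined probability, and the second set of probabilities is a single value `p1` found by a coarse grid search. Because the usual chi-square approximation is not reliable for mixture tests like this one, the p-value is estimated by simulating data under hypothesis 1.

```python
import math
import random

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def loglik_single(data, p0):
    """Hypothesis 1: every user's count is Binomial(n, p0)."""
    return sum(math.log(binom_pmf(k, n, p0)) for k, n in data)

def loglik_mixture(data, p0, p1, w):
    """Hypothesis 2: a fraction w of users follow p1, the rest follow p0."""
    return sum(math.log(w * binom_pmf(k, n, p1) +
                        (1 - w) * binom_pmf(k, n, p0)) for k, n in data)

def lr_statistic(data, p0, grid=10):
    """2 * (best log-likelihood under H2 - log-likelihood under H1)."""
    ll1 = loglik_single(data, p0)
    best = ll1  # w = 0 reproduces H1, so H2 can never fit worse
    for i in range(1, grid):
        for j in range(1, grid + 1):
            best = max(best, loglik_mixture(data, p0, i / grid, j / grid))
    return 2 * (best - ll1)

def bootstrap_p_value(data, p0, n_sim=200, seed=0):
    """Share of datasets simulated under H1 that look at least as extreme."""
    rng = random.Random(seed)
    observed = lr_statistic(data, p0)
    exceed = 0
    for _sim in range(n_sim):
        sim = [(sum(rng.random() < p0 for _trial in range(n)), n)
               for _k, n in data]
        if lr_statistic(sim, p0) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)
```

A small p-value says the single data-mined model is a poor explanation of the second month. It does not say why, so reporting bias or data errors still have to be ruled out separately.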
If someone could show a mistake in the data or the analysis then it would be great, problem solved. Surely there is more that can be done than just dismissing it?
"There was someone a few days ago just saying to send data from a single clan for 2 weeks and they could prove/disprove ... Surely that is not randomly sampled? "
You did not hear me saying that! 100% agreed, this would not be random (or sufficient) data.