Top 5 Healthcare Data Sets For Data Science Projects


If you’re interested in working in the healthcare industry as a data scientist or analyst, you probably prefer to learn with real data sets. The good news is that the internet has quite a few good sources for downloading this sort of data. Here are my top 5 Healthcare Data Sets!

1. Health and Medical Care Archive (HMCA) https://www.icpsr.umich.edu/icpsrweb/content/HMCA/index.html

Very straightforward website. You just need to click on ‘Find Data’, and then you can browse by subject or look up recent additions.

2. UC Data http://ucdata.berkeley.edu/data_record.php?recid=6

This is from UC Berkeley. You can browse by topic (right-hand side). Some links don’t work, but it’s generally a great resource!

3. UNEP http://geodata.grid.unep.ch/

The United Nations Environment Programme has data related to Freshwater, Population, Forests, Emissions, Climate, Disasters, Health and GDP. It covers about 500 variables, so you can pick the ones you need and build your own dataset!
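
If you do build your own dataset from separately downloaded variables, you will usually need to line them up by country and year. Here is a minimal sketch of that idea in Python, using only the standard library. Everything in it is an assumption for illustration: the column names (country, year, population, life_expectancy), the helper names, and the placeholder values are made up and are not the actual format of the UNEP downloads.

```python
import csv
import io

# Hypothetical stand-ins for two single-variable CSV exports.
# Column names and values are placeholders, not real data.
POPULATION_CSV = """country,year,population
A,2000,100
A,2001,110
B,2000,200
"""

HEALTH_CSV = """country,year,life_expectancy
A,2000,70.5
B,2000,65.0
B,2001,65.4
"""


def read_variable(text, value_column):
    """Map (country, year) -> value for one single-variable CSV."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        (row["country"], int(row["year"])): float(row[value_column])
        for row in reader
    }


def merge_variables(tables):
    """Inner-join several {key: value} tables on their shared keys.

    `tables` maps a variable name to its {(country, year): value} dict.
    Only (country, year) pairs present in every table are kept.
    """
    shared_keys = set.intersection(*(set(t) for t in tables.values()))
    return [
        {"country": c, "year": y,
         **{name: t[(c, y)] for name, t in tables.items()}}
        for c, y in sorted(shared_keys)
    ]


merged = merge_variables({
    "population": read_variable(POPULATION_CSV, "population"),
    "life_expectancy": read_variable(HEALTH_CSV, "life_expectancy"),
})
for row in merged:
    print(row)
```

With real downloads you would swap the inline strings for open() calls on your files. The inner join drops any country/year pair that is missing from one of the variables, which keeps the result free of gaps at the cost of losing some rows.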

4. Global Health Data Exchange http://ghdx.healthdata.org/

Near the bottom of the main page, you can explore the site by data type (e.g. census, survey, financial record, etc.), or by keyword, organization, or survey family/series/system. Again, very simple to use.

5. data.gov https://www.healthdata.gov/search/type/dataset

This is the most obvious source, but a great source nevertheless. You can choose from topics like ‘State’, ‘National’, ‘Medicare’, ‘Hospital’, etc.

I hope you’ve found this list useful, and that it helps you on your learning journey!




