OpenNews/hackdays/insideroutsider: Difference between revisions

=== Project Ideas ===
To pre-seed ideas for the hack day, we invited some of the attendees to outline some basic concepts of things to build/problems to solve. If you like these, please add a +1 next to the item. If you have your own ideas, please feel free to add them below. If you have a team already coalescing around an idea, feel free to kick it over to HackDash.
=== Data "White Whales" ===
As a way of giving attendees something to chew on right from the jump, we invited a number of civic data experts to give us lists of their data "white whales": high-value datasets that are currently difficult to access.
'''From Derek Willis, New York Times:'''
* [http://clerk.house.gov/public_disc/foreign/index.aspx U.S. House of Representatives Foreign Travel Reports]
* [http://www.state.gov/r/pa/prs/appt/2013/index.htm U.S. State Department Public Schedule]
* [http://www.whitehouse.gov/omb/oira_meetings/ White House Office of Management and Budget Meeting Records]
'''From Waldo Jaquith, Virginia Decoded:'''
* [http://www.courts.state.va.us/scndex.htm Supreme Court of Virginia court decisions]
* [http://www.courts.state.va.us/wpcap.htm Virginia Court of Appeals court decisions]
Waldo notes: It's a huge obstacle that I simply haven't put any time into dealing with. Every few months I spend half an hour on trying to put together a system to systematically scrape data out, get discouraged, and give up. Footnotes, blockquotes, and page numbers just kill me, although even if I could get the raw text decently, rendered terribly, I could still extract great metadata from them.
'''From John Keefe & Stephen Menendez, WNYC:'''
* [https://dl.dropboxusercontent.com/u/6682410/FY%202013%20Schedule%20C%20-%20Merge%20Final1.pdf 2013 New York City Council budget document (warning: large PDF download)]
* [http://www.nyc.gov/html/nypd/html/traffic_reports/motor_vehicle_accident_data.shtml NYPD Motor Vehicle Accident Data]
'''From Daniel X O'Neil, Smart Chicago Collaborative/Everyblock:'''
* [http://www.nyc.gov/html/nypd/html/analysis_and_planning/stop_question_and_frisk_report.shtml NYPD Stop, Question and Frisk Report Database]
The data is amazingly detailed ([http://www.jjay.cuny.edu/web_images/PRIMER_electronic_version.pdf here's a great primer]), and lends itself to great visualizations ([http://www.nytimes.com/interactive/2010/07/11/nyregion/20100711-stop-and-frisk.html?ref=stopandfrisk here's one re: 2009 data]). The data itself is published in a format that is highly inaccessible to regular people (notwithstanding the fact that it is extremely well-structured as an SPSS portable file). Publishing this info as an easy-to-search, RSS-ready list of items would be high value.
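As a starting point for the "RSS-ready list" idea, here is a minimal sketch that turns already-extracted records into an RSS 2.0 feed using only the Python standard library. It assumes the SPSS file has already been converted to a list of dictionaries by some other step; the function name <code>records_to_rss</code> and the keys <code>id</code>, <code>title</code>, and <code>summary</code> are hypothetical placeholders, not field names from the NYPD database.

```python
import xml.etree.ElementTree as ET


def records_to_rss(records, feed_title, feed_link, feed_description):
    """Build an RSS 2.0 document (as a string) from a list of dicts.

    Each record is assumed to carry the hypothetical keys
    "id", "title", and "summary"; map real dataset columns onto
    these before calling.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_description

    for record in records:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = record["title"]
        ET.SubElement(item, "description").text = record["summary"]
        # The guid is an opaque record identifier, not a URL.
        ET.SubElement(item, "guid", isPermaLink="false").text = record["id"]

    # ElementTree escapes characters such as "&" and "<" on output.
    return ET.tostring(rss, encoding="unicode")
```

Feeding the output to any feed reader should be enough to test whether one item per record is a useful level of detail, or whether items should be grouped (for example by precinct or by day) first.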
* [https://apps.health.ny.gov/pdpw/SearchDrugs/Home.action Prescription Drug Prices in New York State]
This is a gem of a lookup tool that cries out for scraping and simple display. The disparity in drug prices is often profound, even in the space of a few blocks. This fits into a general new trend and huge opportunity to call out disparate health care costs, given the [https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Inpatient.html Medicare Provider Charge Data] provided by the U.S. government recently.
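Once prices have been scraped, the "simple display" could start from a per-drug spread. The sketch below assumes the scraper yields <code>(drug, pharmacy, price)</code> tuples; that row shape and the function name <code>price_spread</code> are assumptions for illustration, not the lookup tool's actual output format.

```python
from collections import defaultdict


def price_spread(rows):
    """Summarize price disparity per drug.

    rows: iterable of (drug, pharmacy, price) tuples (hypothetical shape).
    Returns {drug: (lowest, highest, highest / lowest)}.
    Drugs whose lowest price is not positive are skipped to avoid
    dividing by zero.
    """
    prices = defaultdict(list)
    for drug, _pharmacy, price in rows:
        prices[drug].append(float(price))

    spread = {}
    for drug, values in prices.items():
        lowest, highest = min(values), max(values)
        if lowest > 0:
            spread[drug] = (lowest, highest, highest / lowest)
    return spread
```

Sorting the result by the ratio would surface the drugs with the widest gaps first, which is probably the most useful view for a first pass.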
* [http://www.lapdonline.org/crime_prevention/content_basic_view/42390 LA Crime Data]
Given the reality of a new mayor in LA, it would be good to look for some data in this enormous city with very little available civic data. It has always bothered me that the LAPD has an exclusive relationship with the LA Times on crime data: [http://www.lapdonline.org/crime_prevention/content_basic_view/42390 http://www.lapdonline.org/crime_prevention/content_basic_view/42390]. Muffing that up might be fun.
* [ftp://66.97.146.93/ Dallas FTP Bulk Crime Database]
This is an enormous, underutilized cache of crime data. Chicago gets lots of attention and plaudits for their crime data, but the Dallas stuff goes even farther back (2000!) and contains narrative that will make your eyes bleed. They have the actual comments typed into the system by actual police officers, including graphic details about horrible crimes and a huge amount of profanity. This is a researcher's treasure chest.


=== Tools & APIs ===