- The Open Data Movement
- Introduction
- 50 State Comparison Open Data Portals
- Enacted Open Data Legislation in Top States
- Other Helpful Resources
- Weaknesses in the Evaluation
- Conclusion
Increased connectivity and greater bandwidth have provided access to an ever-growing amount of data. Those who see this as an opportunity to make quicker, better-informed decisions have a competitive edge. Governments are a leading purveyor of data, which can often be paired with proprietary data for greater insight. In a 2013 report, McKinsey and Company estimated that the value of open data could be $3 trillion globally.
Open data is data that is available to the public for unrestricted use. All levels and branches of government generate significant data. Examples include hospital utilization rates, commercial vehicle inspections, and first responder dispatches. When released and accessible in machine-readable formats, the data may provide socially valuable insights for effective government and commercially viable applications for software entrepreneurs.
The United States renewed its commitment to open data with an executive order signed by President Barack Obama in 2013. In addition to giving some history of the open data movement, the order included specific implementation goals. One of the goals was for the Office of Management and Budget (OMB) to issue a memorandum implementing the principles in the executive order. The OMB memo directed agencies to use a "machine-readable, open" format, open licenses for end users, and common core, extensible metadata. (The memorandum is helpful and more understandable than statutory provisions.) A more recent development in this area occurred in January 2019, when President Trump signed the Foundations for Evidence-Based Policymaking Act.
Kentucky provides a portal for GIS data, a second portal for transparency data, and a third portal for registering and operating a business. While many data sets are available from different Kentucky agencies, there is no centralized portal for their dissemination as in other states, and in many cases they are not provided in a machine-readable or liquid format. The Center for Data Innovation ("CDI") ranked the states on many criteria. Kentucky was ranked 36th overall. When broken down by subject area, Kentucky was ranked:
- 1st for education data
- 9th in transit systems
- 12th in building energy efficiency
- 12th in public access to government information
- 16th in electronic health records
- 20th for government financial data
- 21st in e-government
- 24th in smart meters
- 26th in information and data processing
- 28th for e-prescribing
- 30th in consumer devices
- 30th in open data portals
- 30th in statistics jobs
- 32nd in anti-SLAPP data
- 33rd in data science community
- 34th in enabling key technology platforms
- 34th in energy usage data
- 34th in computer science and statistics AP
- 35th in open data 500 companies
- 35th in federal funding for data science R&D
- 36th overall
- 38th in developing human & business capital
- 38th for data availability
- 39th in software service jobs
- 44th in broadband
- 45th in STEM degrees
- 47th in data science job listings
- 48th for legislative data
Issues regarding technology are assigned to the Commonwealth Office of Technology. KRS 42.724; KRS 42.726. The office was created within the Finance and Administration Cabinet. KRS 42.724. The head of the office is both its executive director and the chief information officer for the state. While appointed by the Secretary of the Finance and Administration Cabinet, the appointment must be approved by the Governor. KRS 42.724.
Among its many responsibilities, the Office of Technology is to:
- develop "strategies and policies to support and promote . . . electronic public access to information of the Commonwealth." KRS 42.726(2)(c).
- establish "a central statewide geographic information clearinghouse to maintain map inventories." KRS 42.726(2)(j).
- provide "staff support and technical assistance to the Geographic Information Advisory Council and the Kentucky Information Technology Advisory Council." KRS 42.726(2)(n).
- submit an annual report to the LRC in accordance with KRS 57.390. KRS 42.726(5).
The Commonwealth Office of Technology is advised by two statutorily established boards: the Geographic Information Advisory Council and the Kentucky Information Technology Advisory Council.
Kentucky's executive branch underwent significant reorganization via an executive order issued by Governor Bevin in 2017. The Executive Director of the Commonwealth Office of Technology was declared the Chief Information Officer of all agencies within the Executive Branch of the state. All "agency information technology leads" are to report to both their current cabinet/agency head and the Commonwealth Office of Technology's Chief Information Officer.
Another significant policy was adopted in 2019. CIO-110 names a Chief Data Officer (CDO) to manage the Commonwealth's data. The CDO is responsible for maintaining standards and best practices with regard to data and also acts as the Chief Data Steward. Executive branch agencies shall designate at least one Agency Data Steward who (1) reports directly to the Cabinet Secretary or Agency Head; (2) acts as the liaison to the CDO/ODIA; and (3) is responsible for data stewardship within the agency.
Within CIO-110, the position of "agency data steward" was created. According to the policy, the responsibilities of the position include the following:
- Assisting in the development of the initial data inventory, including identification of high-value data elements, and ensuring that the agency supports the maintenance of the inventory in a timely fashion
- Developing and maintaining a data source inventory describing and categorizing the data created or collected by the state agency, including geospatial data used in a state agency’s geographic information system (GIS)
- Developing and maintaining an open data catalog and machine-readable open datasets
- Developing and maintaining an inventory of all interfaces that describes inbound or outbound datasets generated, aggregated, stored, purchased, or shared by the state agency
- Submitting the agency’s data warehousing, data analytics, and data visualization plans to the CDO for approval prior to procurement or execution of such activities
- Enforcing Commonwealth Data Management Policy, Data Management Standards, and Data Quality Policies and Standards within the agency
- Participating in Commonwealth Data Governance and Management workgroups and subcommittees as appropriate
- Participating in Commonwealth Data Plan activities and leading Agency Data Management planning, including the development of roadmaps and tracking, reporting, and managing roadmap implementation
All data-sharing partners to the Commonwealth, as well, shall designate a Data Steward (or the equivalent) and provide data inventory and maintenance support within the limits of their data sharing agreement.
According to the Open Data Handbook, machine-readable data is data in a format that a computer can process automatically, such as CSV (comma-separated values), JSON (JavaScript Object Notation), or XML (Extensible Markup Language). The opposite of machine-readable is human-readable, which includes unstructured data such as handwriting and PDFs. Although PDFs are digital, a computer struggles to convert PDF tables into a machine-readable format. The conversion is possible, but it requires additional steps and can introduce errors. (For example, Tabula can convert PDF tables into CSV, but the process is time-consuming and designed for smaller data sets.) In the Kentucky Administrative Regulations, "machine-readable" is defined as "information in the form of magnetic code or optical image that can be processed directly by computers and other related machines." 1 KAR 5:010 Accession of public records by means of electronic data processing procedures.
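To illustrate why these formats count as machine-readable, the following minimal sketch reads the same two records from CSV and from JSON using only the Python standard library. The county names, the `inspections` field, and the figures are hypothetical sample values, not real Kentucky data.

```python
import csv
import io
import json

# Hypothetical sample data: the same two records in CSV and in JSON.
CSV_TEXT = "county,inspections\nFayette,120\nJefferson,340\n"
JSON_TEXT = (
    '[{"county": "Fayette", "inspections": 120},'
    ' {"county": "Jefferson", "inspections": 340}]'
)


def total_from_csv(text):
    """Sum the 'inspections' column of a CSV document."""
    reader = csv.DictReader(io.StringIO(text))
    # CSV values arrive as strings, so convert before adding.
    return sum(int(row["inspections"]) for row in reader)


def total_from_json(text):
    """Sum the 'inspections' field of a JSON array of records."""
    return sum(record["inspections"] for record in json.loads(text))


csv_total = total_from_csv(CSV_TEXT)
json_total = total_from_json(JSON_TEXT)
print(csv_total, json_total)
```

Neither format needs a table-extraction step: a few lines of code recover every value exactly, which is the practical difference from a table published as a PDF.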
State data sites tend to be of three kinds: transparency, geographic information system (GIS), or the more comprehensive open data portal. Much of the table below was prepared by Meta S. Brown, States Offer Information Resources: 50+ Open Data Portals, Forbes (Apr. 30, 2018). In some states, both the sites for GIS and open data were included. The data were then merged with the open data portal rankings by the Center for Data Innovation ("CDI").
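A merge of this kind can be sketched as a join on the state name. The sketch below is an assumption about how such a merge might be done, not the method actually used here. The portal entries, URLs, and the state with no ranking are placeholders; only Kentucky's open data portal rank of 30 comes from the list above.

```python
# Hypothetical portal list: one entry per site, several sites per state allowed.
portals = [
    {"state": "Kentucky", "kind": "GIS", "url": "https://example.org/ky-gis"},
    {"state": "Kentucky", "kind": "transparency", "url": "https://example.org/ky-open"},
    {"state": "Example State", "kind": "open data", "url": "https://example.org/xs"},
]

# CDI open data portal rank keyed by state. Kentucky's rank is taken from the
# text; a state missing from this table is kept in the result with rank None.
cdi_portal_rank = {"Kentucky": 30}


def merge_rankings(portal_rows, ranks):
    """Left-join each portal row with its state's CDI portal rank."""
    return [
        {**row, "cdi_portal_rank": ranks.get(row["state"])}
        for row in portal_rows
    ]


merged = merge_rankings(portals, cdi_portal_rank)
for row in merged:
    print(row["state"], row["kind"], row["cdi_portal_rank"])
```

A left join keeps every portal even when a state has no ranking, so gaps in the ranking data stay visible instead of silently dropping rows.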
There were six top-ranked states. The CDI stated that these six states were highlighted because "they have specific open-data policies, open-data portals and machine-readability written into their open data policies." Incorporating the requirement of machine-readability into law allows developers immediate access to usable data. The McKinsey report, mentioned above, uses the term "liquid" instead of machine-readable. Too often Kentucky's data, where available, is in PDF format, making it very difficult to convert to a machine-readable format.
Hawaii's open data policies are codified in chapter 27. The statute defines "data" and "data sets". Hawaii Rev. Stat. Sec. 27-41.1. It establishes an "information technology steering committee", consisting of eleven members. The 11 members include four from the house, four from the senate, one member appointed by the Chief Justice, one member appointed by the governor, and representatives from the executive branch departments. Hawaii Rev. Stat. Sec. 27-43.
"Each executive branch department shall use reasonable efforts to make appropriate and existing electronic data sets . . . electronically available . . . through the state's open data portal." Hawaii Rev. Stat. Sec. 27-44. The statute does not require departments to create "new" data sets, nor does it require data sets to be available on demand. It does require that the data sets be updated as "often as is necessary to preserve the integrity and usefulness of the data."
With regard to future liability and potential warranties, the statute states that "[d]ata sets shall be available for informational purposes only. The State does not warrant the fitness of any data set for a particular purpose and shall not be liable for any deficiencies in the completeness or accuracy of any data set, except where the State's conduct would constitute gross negligence, wilful and wanton misconduct, or intentional misconduct." Hawaii Rev. Stat. Sec. 27-44.1.
Hawaii's chief information officer is charged with developing policies and procedures including "[t]echnical requirements with the goal of making data sets available to the greatest number of users and for the greatest number of applications, including whenever practicable, the use of machine readable, nonproprietary technical standards for web publishing; and guidelines for departments to follow in making data sets available." Hawaii Rev. Stat. Sec. 27-44.3.
The Illinois statute defines "cloud-computing", "data", "open operating standard", "public data", and "voluntary consensus standards body", among others. The implementation portion of the statute reads, "There is hereby established an open operating standard to be known as the Illinois Open Data, for the State of Illinois. Under this operating standard, each agency of State government under the jurisdiction of the Governor shall make available public data sets of public information." Another section requires that all data be located at a single URL and reads, "[p]ublic data sets that are made available on the internet shall be accessible through a single web portal that is linked to data.illinois.gov."
The data sets must be formatted in a way that allows the public to be notified of updates; updated regularly; made available without registration, license requirements, or usage restrictions; and discoverable via external searches. The statute requires an application programming interface (API) for sending and retrieving data. The plan imposes extensive deadlines on the executive branch. It does not use the term "machine-readable", but rather leaves formats to the discretion of the executive. Cloud computing receives preference because it is considered a more cost-effective solution. The data carries no warranty of any kind and is assigned to the public domain.
The Maryland statute defined the following terms: data, data portal, data set, mapping and geographic information systems portal, open data, and open data portal. The statute declared that the state's policy is that "open data be machine readable and released to the public in ways that make the data easy to find, accessible, and usable . . . ." (While the purpose included "machine readable", the term was not included in the definitions.) A Council on Open Data was created with 37 members. The Council's main charge was to carry out the express policy of the statute and make recommendations on legislation and regulations.
The New York statute contains many definitions including: data, data set, publishable state data, and technical standard. The enacting portion of the statute provides:
An online open data website for the collection and public dissemination of publishable state data is hereby established in the office of information technology services. The open data website shall be maintained at data.ny.gov . . . . The open data website will provide access to publishable state data that is owned, controlled, collected or otherwise maintained by covered state entities.
The statute also provides a timeline, with a "data working group" to be formed 45 days after the act becomes law. Within 60 days, each agency of the executive branch must designate a data coordinator who can act on behalf of the agency. Within 180 days, the Office of Information Technology Services shall publish guidelines for the purpose of making public data widely available. Within one year, all publishable state data will be available on the website.
All "publishable state data" shall be "in a format that permits automated processing." Like the Illinois statute, New York requires the data sets to be formatted in a way that allows the public to be notified of updates; updated regularly; made available without registration, license requirements, or usage restrictions; and discoverable via external searches. Like Illinois, New York also requires an application programming interface. New York boasted 1,600 data sets as of January 2019.
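For developers, an application programming interface means a data set can be requested with a URL instead of downloaded by hand. The sketch below assumes the portal exposes a Socrata-style endpoint that accepts `$limit` and `$where` query parameters; that assumption, the `/resource` path, the placeholder dataset identifier, and the `year` column are not taken from the statute. Only the host name data.ny.gov comes from the statutory text.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host named in the statute; the "/resource" path is an assumed convention.
BASE_URL = "https://data.ny.gov/resource"
# Placeholder only, not a real dataset identifier.
DATASET_ID = "xxxx-xxxx"


def build_query_url(dataset_id, limit=5, where=None):
    """Build a JSON query URL with an optional row filter."""
    params = {"$limit": limit}
    if where:
        params["$where"] = where
    # urlencode percent-encodes "$" and spaces so the URL is safe to send.
    return f"{BASE_URL}/{dataset_id}.json?{urlencode(params)}"


def fetch_rows(url):
    """Download and decode one page of rows (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


url = build_query_url(DATASET_ID, limit=2, where="year = 2019")
print(url)
```

`fetch_rows` is defined but not called, since the dataset identifier is a placeholder. With a real identifier, `fetch_rows(url)` would return a list of dictionaries, one per row.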
"Open technology standards", "application programming interface", "performance information metrics" and "convenience information sets" are all defined under the Oklahoma statute. 62 Ok. Stat. Sec. 34.11.1(I).
The Chief Information Officer (CIO) is required to "source and submit" to the "State Governmental Technology Applications Review Board" (hereinafter "Technology Board"), employee performance metrics, convenience information sets and other data streams for publication on Oklahoma's open data portal. 62 Ok. Stat. Sec. 34.11.1.1.1(A).
The Technology Board is to establish "open technology standards" and a schedule for state agencies to adopt the standards. The state agencies are to publish and update data sets, which "shall be accessible through standardized application programming interfaces and published in standardized formats" such as extensible markup language (XML) and comma-separated values (CSV). 62 Ok. Stat. Sec. 34.11.2.
The open data portal should give priority to those data sets that are most often requested by the public.
Utah updated its approach to public records and data in a 2014 bill. Utah's open record policies are promulgated by a 13-member "Transparency Advisory Board" that is charged with specific duties. These duties revolve around what information is included and how it is to be presented to the public.
The 2014 legislation provided for an "information website", meaning a "single Internet website containing public information or links to public information." The legislation added one new responsibility regarding the information website; the board's duties regarding the website are set out in full here:
- study the establishment of an information website and develop recommendations for its establishment;
- develop recommendations about how to make public information more readily available to the public through the information website;
- develop standards to make uniform the format and accessibility of public information posted to the information website; and
- identify and prioritize public information in the possession of a state agency or political subdivision that may be appropriate for publication on the information website.
The fiscal note estimated a one-time cost of approximately $75,000 and an annual cost of $540,000.
Resource | URL | Note |
---|---|---|
Best States for Data Innovation | https://www.datainnovation.org/2017/07/the-best-states-for-data-innovation/ | States ranked for technology. |
Center for Data Innovation | https://www.datainnovation.org | Excellent resource. |
Code for America | https://www.codeforamerica.org | Making government work for people. |
NCSL Open Data | http://www.ncsl.org/research/telecommunications-and-information-technology/open-data-legislation.aspx | 2010 through 2018 Legislation. |
Open Knowledge Int'l | https://okfn.org | Global not-for-profit that promotes free, open data. |
White House Open Data Repo | https://github.com/project-open-data | White House Open Data site. |
Socrata | https://socrata.com | Cloud-based database services for public sector. |
Sunlight Found. Open Data Policies | https://sunlightfoundation.com/opendataguidelines/ | Holding gov't accountable. |
U.S. States Open Data Census | https://census.usopendata.org | User Census of State Datasets |
This writeup depends heavily on the CDI's rankings. The legal research focused on those states with a high ranking from the CDI. The organization is headed by Daniel Castro and affiliated with the Information Technology and Innovation Foundation (ITIF). The ITIF is a 501(c)(3) and its latest 990 tax return can be found here. The author does not know the degree to which this organization is affiliated with any particular industry or political group.
Government data is valuable. It belongs to the public and should be widely distributed, easily accessible, and comprehensive. Many states have created a centralized location to serve as a repository for this data, formatted it so that it can be shared, and created a board to shepherd the data to the public. While Kentucky has taken strong steps in the creation of specialized portals, Kentucky can do more. Kentucky should centralize its data in an open data portal in machine-readable formats. Creating a data-driven state government can spur innovation, improve decision making, drive government transparency, and increase civic engagement.
(To suggest improvements or correct errors, please open an issue on GitHub.)