IIP Roadmap 2016 Fall

Elli Mylonas edited this page May 17, 2017 · 1 revision

Roadmap

Starting in November 2016, IIP is moving into a phase of more intense development, focussing primarily on rewriting the search results interface, integrating the static pages with the dynamic search portion of the site, and providing editing infrastructure that will make it easier to generate Epidoc files that are self-contained and therefore archival.

User Stories

User Story 2: Viewing Search results

This is FINISHED and has GONE LIVE. Some remaining things are:

  • Data cleanup after new formatting is deployed to dev. The site will not look good until that is done, so the cleanup has to happen before the move to production (Carlos, Elli and Gaia). MOSTLY DONE.
  • Proofread and spotcheck extensively

User Story 5: Controlled Vocabularies in Oxygen

As an IIP maintainer, I want to have a mechanism in Oxygen that allows encoders to pull controlled values from an external file without relying on the XML xi:include mechanism. I want to be able to edit the file without having to interact with the code repo, preferably by working in the data repo instead. I would like that file to function as a thesaurus.

  • Figure out how to use Oxygen lookup plugin (https://github.com/BCDH/TEI-Completer) and install in IIP framework
  • Develop thesaurus or other controlled vocabulary file which will either live in the data repo or be web accessible
  • Make sure that there is a script so that files can be batch updated if the controlled vocabulary changes
  • Make sure that the thesaurus/vocabulary is archived properly
  • Possibly use a simple file in order to make this work and develop thesaurus over time

Steps to Take

People: Carlos, Elli

The TEI-Completer plugin is straightforward enough, so once our data/resources are all in github, it shouldn't be too much trouble to integrate them. Batch updating files probably warrants its own story.
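
As a starting point for that batch-update story, the following is a minimal sketch of how renamed vocabulary values might be propagated through an encoded file. It assumes the controlled values appear as space-separated tokens in an attribute; the attribute name `ana`, the sample values, and the function name `update_vocabulary` are hypothetical and are not taken from the IIP encoding.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def update_vocabulary(xml_text, attribute, renames):
    """Return (new_xml_text, changed_elements) after renaming vocabulary values.

    `attribute` is the attribute holding controlled values (an assumption),
    `renames` maps old values to their replacements.
    """
    # Keep TEI as the default namespace when serializing again
    ET.register_namespace("", TEI_NS)
    root = ET.fromstring(xml_text)
    changed = 0
    for element in root.iter():
        value = element.get(attribute)
        if value is None:
            continue
        # An attribute may hold several space-separated values
        tokens = value.split()
        new_tokens = [renames.get(token, token) for token in tokens]
        if new_tokens != tokens:
            element.set(attribute, " ".join(new_tokens))
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical sample: one term whose value "epitaph" has been renamed
SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader>'
    '<term ana="epitaph dedicatory"/>'
    "</teiHeader></TEI>"
)
new_text, count = update_vocabulary(SAMPLE, "ana", {"epitaph": "funerary_epitaph"})
```

Note that the standard-library parser used here discards comments and processing instructions, so a production script would probably need a parser that preserves them before rewriting real files.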

User Story 4: Archiving IIP inscriptions

As the IIP PI, I need to archive the corpus in the Brown Digital Repository so that it will be accessible, re-usable and preserved for the future. I also want to periodically add to or edit the corpus in light of new inscriptions or new information/errors.

  • Develop process for creating archival versions of the files that have no external dependencies
  • Develop scripts for ingesting Epidoc files into the BDR. Possibly use teiHeader metadata
  • Develop policy for ingesting new inscriptions on a regular basis (every 6 months?)
  • Develop policy for replacing, on a regular basis (every 6 months?), inscriptions that have had to be updated
  • Decide if and how to handle formatting for display in BDR or whether that is appropriate

Steps to Take

People: Joseph, Ann, Elli, Andrew?

Each of these is probably its own entire project, requiring meetings and decisions before it can be timelined.

  • Joseph suggests that publication and updating be handled by creating a tagged git release branch at agreed-upon intervals.
  • EM tells Joseph/Ben which features from the teiHeader she wants indexed. Note that Carlos already indexes all of this and more, so we could take advantage of what he's already done.
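
One way to check that an archival version of a file has no external dependencies is to look for leftover xi:include elements before ingest. The sketch below does only that; the function name `find_external_dependencies` and the sample file name are hypothetical, and other kinds of dependency (for example, links to the Zotero bibliography) would need their own checks.

```python
import xml.etree.ElementTree as ET

XINCLUDE_NS = "http://www.w3.org/2001/XInclude"


def find_external_dependencies(xml_text):
    """Return the href of every xi:include element still present in the file."""
    root = ET.fromstring(xml_text)
    # ElementTree does not resolve XInclude on its own, so unresolved
    # includes show up as ordinary elements in the XInclude namespace
    return [el.get("href", "") for el in root.iter("{%s}include" % XINCLUDE_NS)]


# Hypothetical sample with one unresolved include
WITH_INCLUDE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0" '
    'xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<teiHeader><xi:include href="taxonomies.xml"/></teiHeader></TEI>'
)
SELF_CONTAINED = '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader/></TEI>'

pending = find_external_dependencies(WITH_INCLUDE)
clean = find_external_dependencies(SELF_CONTAINED)
```

A file would be considered ready for archiving, under this assumption, only when the returned list is empty.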

DONE User story 1: Editing Static Pages

As researchers who maintain this site, the PI and his graduate students/postdocs want to be able to easily edit the static "About" pages on the site.

  • They can learn how to use Markdown (md).
  • They would like the ability to add new pages.
  • They can link new pages in by creating links in the text (don't need automatic menu creation, for ex.).
  • If a page is to be added as a menu item, this would be requested and handled in the django web interface.

Steps to Take

People: Carlos, Birkin

Assuming we use the existing Django installation to serve and edit static pages:

  • Set up Django to allow for static pages loaded from the database: 3 hours?
  • Add CMS system to allow editing from Django: 1-2 hours? ask Ben + Joseph about static pages in Rome
  • Move existing content into Django CMS: 1 hour (EM)

Total: about 6 hours, 3-4 afternoons, plus time for a lot of decisions to be made and meetings to be had. **This feature is being worked on in the sprint that began on Dec. 5.**

Estimate: 2-3 weeks. Mid-Late December.

DONE User Story 2: Viewing Search results

As users of this site (mostly people who read Hebrew, Greek and Latin), we would like to have a cleaner experience when presented with a list of search results.

  • The formatting of the results, especially of the transcribed text should be accurate.
  • The user should see a single list of search results.
  • The search results page should have a visible URL that can be used to recreate the search. DONE

Steps to Take

People: Carlos, Birkin

Already partly in progress

  • Re-design and re-do results list: 3+ hours, depending on design choices and roadblocks along the way
  • Replace existing display code with CETEIcean library + stylesheet: 2 hours
  • URLs should already be bookmarkable; decide whether improvements are needed: 0-1 hour
  • Data cleanup after new formatting is deployed to dev. The site will not look good until that is done, so the cleanup has to happen before the move to production (Carlos, Elli and Gaia)
  • Proofread and spotcheck extensively

Total: 5-6 hours, 3-4 afternoons.

Estimate: 1-2 weeks. Early December.
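
The requirement that a results URL can be used to recreate the search amounts to a round trip between search criteria and a query string. A minimal sketch using only the Python standard library follows; the base URL, the field names `language` and `text`, and both function names are hypothetical and do not describe the existing Django views.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_results_url(base_url, criteria):
    """Encode search criteria (field -> list of values) into a results URL."""
    # Sorting the fields means the same search always yields the same URL
    query = urlencode(sorted(criteria.items()), doseq=True)
    return f"{base_url}?{query}"


def criteria_from_url(url):
    """Recover the search criteria from a bookmarked results URL."""
    return parse_qs(urlsplit(url).query)


# Hypothetical search: two languages plus a free-text term
criteria = {"language": ["grc", "heb"], "text": ["shalom"]}
url = build_results_url("https://example.org/results/", criteria)
recovered = criteria_from_url(url)
```

Keeping every criterion in the query string, rather than in session state, is what makes the URL safe to bookmark or share.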

DEMOTED User Story 3: Managing IIP inscriptions

As the encoding manager for IIP, I need to be able to manage finished work in a version control system and trigger indexing for new and old inscriptions.

  • The version control system needs to be easy to access for contributors who are at Brown and also outside. (Ideally move to git from svn)
  • The right files should be in the data repo vs the code repo so encoders/managers can intervene with controlled vocabularies and formatting. The same goes for the Solr schema
  • There should be a script that indexes new inscriptions
  • There should be a trigger that re-indexes edited inscriptions and doesn't change their proofreading status.
  • As the proofreader for IIP, I need to be able to see which inscriptions need proofreading, then change to "Approved" or "Needs Work" as appropriate.

Steps to Take

People: Birkin, Carlos, Elli, (Joseph/Ben for brainstorming)

There are a lot of decisions to make design- and deployment-wise here, but this is the first thing we need to do to be able to accomplish a lot of these other stories, so it will likely be the first thing done in January, or concurrently with the other projects this fall. Sprint starting Dec. 5: come up with a plan for implementing this feature.
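
As one input to that planning, the requirement that re-indexing an edited inscription must not change its proofreading status can be sketched as a merge step that runs before the document is sent to the index. The field name `proofreading_status`, the default value "Needs Proofreading", and the function name are hypothetical; how the status is actually stored is one of the decisions still to be made.

```python
# Hypothetical field name: the real index may store this differently
PRESERVED_FIELDS = ("proofreading_status",)


def merge_for_reindex(existing_doc, fresh_doc, preserved=PRESERVED_FIELDS):
    """Build the document to index for a new or an edited inscription.

    `existing_doc` is what the index currently holds (None for a new
    inscription); `fresh_doc` is freshly extracted from the XML source.
    """
    merged = dict(fresh_doc)
    if existing_doc is not None:
        for field in preserved:
            # Carry over values that proofreaders set by hand
            if field in existing_doc:
                merged[field] = existing_doc[field]
    return merged


existing = {"id": "samp0001", "text": "old reading", "proofreading_status": "Approved"}
fresh = {"id": "samp0001", "text": "corrected reading", "proofreading_status": "Needs Proofreading"}

reindexed = merge_for_reindex(existing, fresh)  # edited inscription
new_doc = merge_for_reindex(None, fresh)        # brand-new inscription
```

In this sketch an edited inscription keeps its "Approved" status while its text is refreshed, and a new inscription enters the index with the default status.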

User Story 6: Images in IIP

As the IIP PI and also as an IIP user, I would like to incorporate images into the display of an inscription.

  • Thumbnail view available when inscription appears in list of search results
  • Full image displayed with full inscription display
  • Image information encoded in XML file

User Story 7: Maps in IIP

As the IIP PI and also as an IIP user, I would like to have visual geographic information about the inscriptions, and to be able to use a geographic interface to explore and browse them.

  • For each full inscription display, show a thumbnail map of the area with a small dot marking the inscription
  • Reincorporate the D3.js geographic interface that Dan Sheibler built, but rework it.

Current architecture

IIP is currently built on a Django platform, from XML source files, using a Solr index. The static "About" pages reside on a separate server, and are simple HTML pages.

Source texts

The source texts are encoded in XML using the Epidoc customization of the TEI schema [https://sourceforge.net/p/epidoc/wiki/Home/]. They currently use an external file to store values for controlled vocabularies and rely on a Zotero bibliography for citations [https://www.zotero.org/iip/items].

Solr index

The source texts are indexed using Solr. The <teiHeader> metadata is extracted and provides search criteria and facets. All search fields and facets reflect live data derived from the source files, so it is not necessary to maintain separate lists. The full text is also indexed.
ACTION: The Solr schema currently resides within the Django universe. It should reside in the git data repo.
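
To illustrate the kind of extraction described above, here is a minimal sketch that turns an abbreviated TEI file into a flat dictionary suitable for sending to an index. It is not the existing indexing code: the chosen header elements, the output field names (`id`, `title`, `language`, `text`), and the sample file are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def header_to_index_doc(xml_text):
    """Extract a few teiHeader values plus the full text into a flat dict."""
    root = ET.fromstring(xml_text)
    header = root.find("tei:teiHeader", NS)
    if header is None:
        raise ValueError("file has no teiHeader")
    doc = {
        "id": header.findtext(
            ".//tei:publicationStmt/tei:idno", default="", namespaces=NS
        ).strip(),
        "title": header.findtext(
            ".//tei:titleStmt/tei:title", default="", namespaces=NS
        ).strip(),
        # Multi-valued field, usable as a facet
        "language": [
            lang.get("ident")
            for lang in header.iterfind(".//tei:langUsage/tei:language", NS)
            if lang.get("ident")
        ],
    }
    body = root.find("tei:text", NS)
    # Collapse whitespace so the indexed full text is one clean string
    words = "".join(body.itertext()).split() if body is not None else []
    doc["text"] = " ".join(words)
    return doc


# Abbreviated, hypothetical sample file
SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader><fileDesc>'
    "<titleStmt><title>Sample inscription</title></titleStmt>"
    '<publicationStmt><idno type="IIP">samp0001</idno></publicationStmt>'
    "</fileDesc><profileDesc><langUsage>"
    '<language ident="grc">Greek</language>'
    '<language ident="heb">Hebrew</language>'
    "</langUsage></profileDesc></teiHeader>"
    '<text><body><div type="edition"><p>first line <lb/>second line</p></div></body></text>'
    "</TEI>"
)
index_doc = header_to_index_doc(SAMPLE)
```

Because every facet value is read from the source file at indexing time, no separate list of values has to be maintained, which matches the behaviour described above.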

Django platform

The search/result part of the site is built in Django and resides on worf.services.brown.edu. The static "about" pages reside on pcdscit.services.brown.edu. We would prefer to work in a unified environment with static pages provided by the Django database as CMS (see above). The Django site essentially consists of 3 pages:

  • the search page
  • the results list pages
  • the inscription display page.