Commit

move data directory into package
thatbudakguy committed Jan 16, 2020
1 parent 38f4020 commit 340f41f
Showing 8 changed files with 9 additions and 5 deletions.
1 change: 0 additions & 1 deletion MANIFEST.in

This file was deleted.

2 changes: 1 addition & 1 deletion README.md
@@ -60,7 +60,7 @@ $ dphon --version

 ## methodology

-matching sequences are determined by a dictionary file that represents a particular reconstruction of old chinese phonology (you can see some examples in the `data/` folder). these data structures map an input character to an arbitrary sound token ("dummy") that can be matched against other such tokens.
+matching sequences are determined by a dictionary file that represents a particular reconstruction of old chinese phonology (you can see some examples in the `dphon/data/` folder). these data structures map an input character to an arbitrary sound token ("dummy") that can be matched against other such tokens.

 the core process of DIRECT is to accept plaintext input, tokenize it according to a particular phonological reconstruction, and search for matches amongst the tokenized text. these matches thus represent resonance: sequences that could have rhymed when they were originally read aloud, despite dissimilarity in their written forms.
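
The tokenize-then-match process described in the README text above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `TOY_DICT`, `to_tokens`, and `shared_ngrams` are hypothetical names, the mapping is invented and is not real Old Chinese data, and tokens are assumed to be single characters so that token positions line up with character positions. It is not the project's actual matching code.

```python
from typing import Dict, List, Tuple

# Hypothetical toy dictionary, NOT real phonological data. Two distinct
# characters share the token "A" to stand in for a shared reconstructed sound.
TOY_DICT: Dict[str, str] = {"甲": "A", "乙": "B", "丙": "A", "丁": "C"}


def to_tokens(text: str, mapping: Dict[str, str]) -> str:
    """Replace each character with its sound token; keep unknown characters."""
    return "".join(mapping.get(char, char) for char in text)


def shared_ngrams(
    a: str, b: str, mapping: Dict[str, str], n: int = 2
) -> List[Tuple[int, int]]:
    """Return (i, j) start positions where n tokens of a equal n tokens of b."""
    ta, tb = to_tokens(a, mapping), to_tokens(b, mapping)
    return [
        (i, j)
        for i in range(len(ta) - n + 1)
        for j in range(len(tb) - n + 1)
        if ta[i:i + n] == tb[j:j + n]
    ]
```

In this sketch, two texts that differ in writing can still match after tokenization, because the comparison is made on sound tokens rather than on the written characters.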

File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
8 changes: 6 additions & 2 deletions dphon/lib.py
@@ -1,12 +1,16 @@
 import json
 from collections import defaultdict
-from typing import List, Dict, Tuple
+from os.path import basename, splitext
+from typing import Dict, List, Tuple

+import pkg_resources

 '''Non-alphabetic symbols used in place of a character.'''
 CHAR_MARKERS = ['□']

-with open('data/dummy_dict.json', encoding='utf-8') as file:
+'''Dictionary based on Schuessler's reconstruction of Old Chinese.'''
+schuessler_path = pkg_resources.resource_filename(__package__, 'data/dummy_dict.json')
+with open(schuessler_path, encoding='utf-8') as file:
     DUMMY_DICT = json.loads(file.read())

 def phonetic_tokens(string: str) -> str:
3 changes: 2 additions & 1 deletion setup.py
@@ -39,7 +39,8 @@ def run(self):
     long_description=long_description,
     long_description_content_type='text/markdown',
     url='https://github.com/direct-phonology/direct',
-    include_package_data=True, # include extra data files, e.g. dictionaries
+    include_package_data=True,
+    package_data={'dphon': ['data/*.json']},
     author='John O\'Leary, Nick Budak, Gian Rominger',
     author_email='jo10@princeton.edu, nbudak@princeton.edu, gianr@princeton.edu',
     license='MIT',
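
The point of this commit is that the dictionary is resolved relative to the installed package instead of the current working directory. The sketch below illustrates that idea with the standard library's `pkgutil.get_data` rather than the `pkg_resources.resource_filename` call used in the commit, so it is a swapped-in technique and not the author's code. The package name `demo_pkg`, the helper names, and the one-entry dictionary are all hypothetical.

```python
import importlib
import json
import pkgutil
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create a throwaway package (hypothetical name) with a data/ folder."""
    data_dir = root / "demo_pkg" / "data"
    data_dir.mkdir(parents=True)
    (root / "demo_pkg" / "__init__.py").write_text("", encoding="utf-8")
    (data_dir / "dummy_dict.json").write_text(
        json.dumps({"甲": "A"}, ensure_ascii=False), encoding="utf-8"
    )


def load_packaged_dict(package: str, resource: str) -> dict:
    """Read a JSON resource relative to the package, not the working directory."""
    raw = pkgutil.get_data(package, resource)
    if raw is None:
        raise FileNotFoundError(f"{resource} not found in {package}")
    return json.loads(raw.decode("utf-8"))


# Build the demo package in a temporary directory and make it importable.
tmp = tempfile.TemporaryDirectory()
build_demo_package(Path(tmp.name))
sys.path.insert(0, tmp.name)
importlib.invalidate_caches()

# The lookup succeeds regardless of the directory the script is run from.
LOADED = load_packaged_dict("demo_pkg", "data/dummy_dict.json")
```

Because the resource path is anchored to the package, the load no longer depends on where the command is invoked, which is the failure mode of the original `open('data/dummy_dict.json')` call once the tool is installed.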
