A TYPO3 CMS extension that provides Apache Tika functionality, including:
- text extraction
- metadata extraction
- language detection (from strings or files)
Tika can be used as a standalone Tika app/jar, as a Tika server, or via SolrCell integrated in Apache Solr.
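As a rough illustration of the server mode, the three features listed above map to separate Tika server endpoints. The sketch below assumes a Tika server listening on http://localhost:9998 (the server's default port); the helper name `tika_endpoint`, the `TIKA_SERVER` variable, and the file `document.pdf` are hypothetical and not part of this extension.

```shell
# Hypothetical helper: map a feature name to a Tika server endpoint.
# Assumes a Tika server at http://localhost:9998 unless TIKA_SERVER is set.
tika_endpoint() {
  base="${TIKA_SERVER:-http://localhost:9998}"
  case "$1" in
    text)     echo "$base/tika" ;;             # text extraction
    meta)     echo "$base/meta" ;;             # metadata extraction
    language) echo "$base/language/stream" ;;  # language detection
    *)        echo "unknown feature: $1" >&2; return 1 ;;
  esac
}

# Example call against a running server (document.pdf is a placeholder):
#   curl -s -T document.pdf -H 'Accept: text/plain' "$(tika_endpoint text)"
```

The extension talks to Tika through its own PHP services; the helper only shows which endpoint each feature corresponds to when a Tika server is used.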
We're open to contributions!
For further information about Apache Tika, please see the project's homepage: https://tika.apache.org/
Powered by the TYPO3 community and
We use GitHub Actions for continuous integration.
To run the test suite locally, please use our DDEV Docker environment: https://github.com/TYPO3-Solr/solr-ddev-site
Note: This requires a matching combination of branches:
- solr-ddev-site on main
- packages/ext-solr on main
- packages/ext-tika on main
- Please refer to the version matrix for the proper combination of branches
ddev solr:enable tika
ddev composer t3:standards:fix packages/ext-tika/
ddev composer tests:tika:phpstan
ddev composer tests:tika:unit
ddev composer tests:tika:integration
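The commands above can be chained so that a failing step stops the run. The following is a minimal sketch, not part of the project tooling: the `run` and `run_tika_checks` helpers and the `DRY_RUN` variable are hypothetical, and it assumes the commands are executed from within the solr-ddev-site checkout.

```shell
# Hypothetical wrapper: run one command, or only print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run the checks listed above in order; "&&" stops at the first failure.
run_tika_checks() {
  run ddev solr:enable tika &&
  run ddev composer t3:standards:fix packages/ext-tika/ &&
  run ddev composer tests:tika:phpstan &&
  run ddev composer tests:tika:unit &&
  run ddev composer tests:tika:integration
}
```

Calling `DRY_RUN=1 run_tika_checks` prints the five commands without executing them, which is a quick way to check the sequence before a real run.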
- Fork the repository
- Clone the repository
- Create a new branch
- Make your changes
- Commit your changes to your fork. In your commit message, refer to the issue number if one already exists, e.g.:
[BUGFIX] short description of fix (resolves #4711)
- Submit a pull request (for some hints, see "How to write the perfect pull request")
To keep your fork in sync with the original repository:
- git remote add upstream https://github.com/TYPO3-Solr/ext-tika.git
- git fetch upstream
- git checkout main
- git merge upstream/main
- git push origin main
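The commit message convention from the contribution steps above can be captured in a small helper. This is only a sketch: the `commit_message` function is hypothetical, and it reproduces just the pattern shown in the `[BUGFIX]` example, with the issue reference being optional.

```shell
# Hypothetical helper: build a commit message in the format shown above.
# Usage: commit_message KIND "description" [ISSUE_NUMBER]
commit_message() {
  kind="$1"
  description="$2"
  issue="${3:-}"
  if [ -n "$issue" ]; then
    # An issue exists: append the "(resolves #N)" reference.
    printf '[%s] %s (resolves #%s)\n' "$kind" "$description" "$issue"
  else
    printf '[%s] %s\n' "$kind" "$description"
  fi
}

# Example use when committing:
#   git commit -m "$(commit_message BUGFIX "short description of fix" 4711)"
```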