Pujangga is a REST API interface to InaNLP and Deeplearning4j's Word2Vec for Indonesian (Bahasa Indonesia).
Screenshot: a Pujangga request and response in the Paw REST client.
- Dr. Eng. Ayu Purwarianti, ST., MT., et al.
- Yudi Wibisono, MT.
- InaNLP related resources copied from teaolivia
- Install Scala 2.12.2 and Lightbend Activator
- Clone the project
$ git clone git@github.com:panggi/pujangga.git
- Download the dependencies
$ cd pujangga
$ activator
- Download the pretrained word2vec model from https://drive.google.com/uc?id=0B5YTktu2dOKKNUY1OWJORlZTcUU&export=download
- Run the application
$ export WORD2VEC_FILE=/path/to/word2vec_wiki_id
$ activator run
- Access the API at http://localhost:9000
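The endpoints listed below all take a JSON body over POST. As a minimal client sketch in Python (standard library only), a request could be built and sent roughly as follows. The helper names `build_request` and `call` are hypothetical, and the `Content-Type: application/json` header is an assumption, since the original does not state which headers the server requires.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9000"  # default address from the run step


def build_request(path, payload, base_url=BASE_URL):
    """Build a POST request carrying `payload` as a JSON body (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        # Assumption: the server expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(path, payload, base_url=BASE_URL):
    """Send the request and return the decoded JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(path, payload, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the application running, `call("/stemmer", {"string": "kunjungan resmi"})` would return the decoded response object.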
Request:
POST /stemmer
{
"string": "Prof. Habibie akan melakukan kunjungan resmi ke PT. Pindad di Bandung"
}
Response:
{
"status": "success",
"data": "prof Habibie akan laku kunjung resmi ke pt Pindad di bandung"
}
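Every response shown in this document wraps its result in a `status` field and a `data` field. A small sketch of unwrapping that envelope in Python follows. The `unwrap` helper is hypothetical, and because only `"success"` responses appear here, treating any other status as an error is an assumption.

```python
import json


def unwrap(response_text):
    """Return the `data` field of a response, or raise if status is not "success"."""
    envelope = json.loads(response_text)
    # Assumption: any status other than "success" signals a failure.
    if envelope.get("status") != "success":
        raise ValueError(f"request failed: {envelope!r}")
    return envelope["data"]


# The /stemmer response shown above, as a raw JSON string.
sample = (
    '{"status": "success", '
    '"data": "prof Habibie akan laku kunjung resmi ke pt Pindad di bandung"}'
)
print(unwrap(sample))
```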
Request:
POST /phrasechunker
{
"string": "Prof. Habibie akan melakukan kunjungan resmi ke PT. Pindad di Bandung"
}
Response:
{
"status": "success",
"data": {
"map": {
"Pindad ": "NP",
"Prof. Habibie ": "NP",
".": ".",
"di Bandung ": "PP",
"akan melakukan kunjungan resmi ke PT ": "VP"
},
"list": [
"NP",
"VP",
"NP",
"PP"
]
}
}
Request:
POST /postagger
{
"string": "Prof. Habibie akan melakukan kunjungan resmi ke PT. Pindad di Bandung"
}
Response:
{
"status": "success",
"data": {
"map": {
"resmi": "JJ",
".": ".",
"akan": "MD",
"ke": "IN",
"di": "IN",
"Bandung": "NNP",
"Pindad": "NNP",
"PT": "NN",
"Prof.": "NNP",
"kunjungan": "NN",
"Habibie": "NNP",
"melakukan": "VBT"
},
"list": [
"NNP",
"NNP",
"MD",
"VBT",
"NN",
"JJ",
"IN",
"NN",
"NNP",
"IN",
"NNP"
]
}
}
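In the response above, `map` is keyed by word, so a word can appear only once, and the order of keys does not follow the sentence. One way to make the result easier to scan is to group the words by tag. The sketch below does this in Python; `words_by_tag` is a hypothetical helper, not part of Pujangga.

```python
from collections import defaultdict


def words_by_tag(tag_map):
    """Invert a word -> tag mapping into tag -> sorted list of words."""
    grouped = defaultdict(list)
    for word, tag in tag_map.items():
        grouped[tag].append(word)
    return {tag: sorted(words) for tag, words in grouped.items()}


# The `map` object from the /postagger response above.
pos_map = {
    "resmi": "JJ", ".": ".", "akan": "MD", "ke": "IN", "di": "IN",
    "Bandung": "NNP", "Pindad": "NNP", "PT": "NN", "Prof.": "NNP",
    "kunjungan": "NN", "Habibie": "NNP", "melakukan": "VBT",
}
print(words_by_tag(pos_map)["NNP"])  # ['Bandung', 'Habibie', 'Pindad', 'Prof.']
```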
Request:
POST /netagger
{
"string": "Prof. Habibie akan melakukan kunjungan resmi ke PT. Pindad di Bandung"
}
Response:
{
"status": "success",
"data": [
"OTHER",
"PERSON-B",
"OTHER",
"OTHER",
"OTHER",
"OTHER",
"OTHER",
"LOCATION-B",
"OTHER",
"PERSON-B",
"OTHER",
"LOCATION-B"
]
}
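The response is a flat list of labels. Assuming there is one label per token, in sentence order, the labels can be paired with tokens to pull out the entries that are not `OTHER`. The original does not show the tokens, so the `tokens` list below is a guess in which "PT." is split into "PT" and "."; `named_entities` is a hypothetical helper.

```python
def named_entities(tokens, labels):
    """Pair tokens with labels and keep the pairs not labelled OTHER."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return [(tok, lab) for tok, lab in zip(tokens, labels) if lab != "OTHER"]


# Labels copied from the /netagger response above.
labels = [
    "OTHER", "PERSON-B", "OTHER", "OTHER", "OTHER", "OTHER",
    "OTHER", "LOCATION-B", "OTHER", "PERSON-B", "OTHER", "LOCATION-B",
]
# Assumption: this is how the service tokenized the sentence.
tokens = [
    "Prof.", "Habibie", "akan", "melakukan", "kunjungan", "resmi",
    "ke", "PT", ".", "Pindad", "di", "Bandung",
]
print(named_entities(tokens, labels))
```

If the real tokenization differs, the length check raises instead of silently misaligning tokens and labels.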
Request:
POST /formalizer
{
"string": "Sis, lu bisa nggak pesenin gw sepatu newbalance tipe 960? gpl ya. hati2 sama penipuan anak 4l4y"
}
Response:
{
"status": "success",
"data": "Sis , kamu bisa tidak pesankan saya sepatu newbalance tipe 960 ? tidak pakai lama iya . hati-hati sama penipuan anak norak "
}
Request:
POST /stopwords
{
"string": "Prof. Habibie akan melakukan kunjungan resmi ke PT. Pindad di Bandung"
}
Response:
{
"status": "success",
"data": "Prof. Habibie kunjungan resmi PT . Pindad Bandung "
}
Request:
POST /sentence/tokenizer
{
"string": "Saya pergi ke (bagian kanan) rumah sakit Prof. Dr. Soerojo."
}
Response:
{
"status": "success",
"data": [
"Saya",
"pergi",
"ke",
"(",
"bagian",
"kanan",
")",
"rumah",
"sakit",
"Prof.",
"Dr.",
"Soerojo",
"."
]
}
Request:
POST /sentence/tokenizer/composite
{
"string": "Saya pergi ke (bagian kanan) rumah sakit Prof. Dr. Soerojo."
}
Response:
{
"status": "success",
"data": [
"Saya",
"pergi",
"ke",
"(",
"bagian kanan",
")",
"rumah sakit",
"Prof.",
"Dr.",
"Soerojo",
"."
]
}
Request:
POST /sentence/splitter
{
"string": "Michael Jeffrey Jordan dilahirkan di Brooklyn, New York, Amerika Serikat, pada 17 Februari 1963 adalah pemain bola basket profesional asal Amerika. Michael Jordan merupakan pemain terkenal di dunia dalam cabang olahraga itu. Setidaknya ia enam kali merebut kejuaraan NBA bersama kelompok Chicago Bulls (1991-1993, 1996-1998). Ia memiliki tinggi badan 198 cm dan merebut gelar pemain terbaik."
}
Response:
{
"status": "success",
"data": [
"Michael Jeffrey Jordan dilahirkan di Brooklyn, New York, Amerika Serikat, pada 17 Februari 1963 adalah pemain bola basket profesional asal Amerika .",
"Michael Jordan merupakan pemain terkenal di dunia dalam cabang olahraga itu .",
"Setidaknya ia enam kali merebut kejuaraan NBA bersama kelompok Chicago Bulls (1991-1993, 1996-1998) .",
"Ia memiliki tinggi badan 198 cm dan merebut gelar pemain terbaik ."
]
}
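The splitter and tokenizer endpoints can be chained: split a paragraph into sentences, then tokenize each sentence. The Python sketch below shows the idea. Here `post` is a hypothetical callable that sends a JSON payload to a path and returns the response's `data` field, and `fake_post` is a stub that only mimics the response shapes so the example runs without a live server.

```python
def tokenize_paragraph(text, post):
    """Split `text` into sentences, then tokenize each sentence."""
    sentences = post("/sentence/splitter", {"string": text})
    return [post("/sentence/tokenizer", {"string": s}) for s in sentences]


def fake_post(path, payload):
    """Stand-in for a real HTTP call; NOT how the service actually works."""
    if path == "/sentence/splitter":
        parts = payload["string"].split(".")
        return [p.strip() + " ." for p in parts if p.strip()]
    return payload["string"].split()


print(tokenize_paragraph("Saya pergi. Ia datang.", fake_post))
# [['Saya', 'pergi', '.'], ['Ia', 'datang', '.']]
```

Passing the transport in as an argument keeps the chaining logic separate from HTTP details, so a real client function can replace the stub unchanged.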
Request:
POST /word2vec/nearestwords
{
"string": "mobil",
"n": 10
}
Response:
{
"status": "success",
"data": [
"motor",
"dikendarai",
"sepeda",
"truk",
"motornya",
"mengemudikan",
"mobil-mobil",
"mobilnya",
"mengendarai",
"pengemudi"
]
}
Request:
POST /word2vec/arithmetic
{
"first_string": "serang",
"second_string": "malang",
"third_string": "surabaya",
"n": 10
}
Response:
{
"status": "success",
"data": [
"serang",
"lebak",
"puloampel",
"keserangan",
"bogor",
"waringinkurung",
"jawilan",
"cianjur",
"garut",
"padarincang"
]
}
Request:
POST /word2vec/similarity
{
"first_string": "sore",
"second_string": "petang"
}
Response:
{
"status": "success",
"data": 0.7748607993125916
}
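A similarity score between two word vectors is commonly computed as cosine similarity. That is assumed here to be what this endpoint returns; the original does not say which measure is used. The sketch below is self-contained and uses made-up toy vectors, not values from the pretrained model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy vectors for illustration only.
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```

Values near 1 mean the two vectors point in nearly the same direction, which for word embeddings is usually read as the words being used in similar contexts.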
All files in the libs and resource directories are the property of Dr. Eng. Ayu Purwarianti, ST., MT., et al. and are not covered by the license below (Apache License, Version 2.0).
All other custom code written by Panggi Libersa Jasri Akadol is licensed under the Apache License, Version 2.0 (the "License"); you may not use this project except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.