A functional library for Hangeul transliteration
A snapshot of version 0.0.1 is available from Sonatype. It is cross-built for Scala 2.11, 2.12 and 2.13. To use it, add the following to your build.sbt:
resolvers += Resolver.sonatypeRepo("snapshots")
libraryDependencies += "com.github.sophiecollard" %% "hangeul4s" % "0.0.1-SNAPSHOT"
This project is currently under development.
- Implement romanization of jamo
- Implement romanization of syllables
- Implement conversion between jamo and syllables
- Implement parsing of Hangeul text
- Add CircleCI integration
- Add Codecov integration
- Cross-build for Scala 2.11, 2.12 and 2.13
- Add Apache 2.0 licence
import hangeul4s.implicits._
import hangeul4s.model.hangeul.HangeulTextElement
import hangeul4s.model.romanization.RomanizedTextElement
val input = "안녕하세요"
// input: String = 안녕하세요
val output = for {
  parsed <- input.parseTo[HangeulTextElement]
  transliterated <- parsed.transliterateTo[RomanizedTextElement]
} yield transliterated.unparseTo[String]
// output: scala.util.Either[hangeul4s.error.Hangeul4sError,String] = Right(annyeonghaseyo)
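Since the result is an `Either`, callers typically need to handle the error case before using the romanized string. The following is a minimal sketch using only the Scala standard library; `ResultSketch` and `render` are hypothetical names, and the type parameter `E` merely stands in for the library's error type.

```scala
object ResultSketch {
  // E is a placeholder for whatever error type the Either carries
  // (hypothetical helper, not part of hangeul4s).
  def render[E](result: Either[E, String]): String =
    result.fold(
      error => s"transliteration failed: $error", // Left: describe the failure
      identity                                    // Right: return the romanized text
    )
}
```

Calling `ResultSketch.render(output)` would then yield either the romanized text or a short failure message.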
import cats.implicits._
import hangeul4s.implicits._
import hangeul4s.model.hangeul.HangeulTextElement
import hangeul4s.model.romanization.RomanizedTextElement
// first sentence of second paragraph of the Korean Wikipedia article on Seoul (retrieved 2019-09-22)
// See https://ko.wikipedia.org/wiki/%EC%84%9C%EC%9A%B8%ED%8A%B9%EB%B3%84%EC%8B%9C
val input = "시청 소재지는 중구이며, 25개의 자치구로 이루어져 있다."
// input: String = 시청 소재지는 중구이며, 25개의 자치구로 이루어져 있다.
val output = for {
  parsed <- input.parseToF[Vector, HangeulTextElement]
  transliterated <- parsed.transliterateToF[Vector, RomanizedTextElement]
} yield transliterated.unparseTo[String]
// output: scala.util.Either[hangeul4s.error.Hangeul4sError,String] = Right(sicheong sojaejineun jungguimyeo, 25gaeui jachiguro irueojyeo itda.)
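The sentence above mixes Hangeul with digits, spaces and punctuation, which pass through to the output unchanged. As an illustration only, and not a description of how hangeul4s parses text internally, mixed input could be split into runs of Hangeul syllables and runs of other characters like this (`TokenSketch`, `Token`, `HangeulRun` and `OtherRun` are hypothetical names):

```scala
object TokenSketch {
  sealed trait Token
  final case class HangeulRun(text: String) extends Token
  final case class OtherRun(text: String) extends Token

  // Precomposed Hangeul syllables occupy U+AC00 to U+D7A3.
  def isSyllable(c: Char): Boolean = c >= '\uAC00' && c <= '\uD7A3'

  // Split the input into maximal runs of syllables and non-syllables.
  def tokenize(input: String): Vector[Token] = {
    @annotation.tailrec
    def loop(rest: String, acc: Vector[Token]): Vector[Token] =
      if (rest.isEmpty) acc
      else {
        val hangeul = isSyllable(rest.head)
        val (run, tail) = rest.span(c => isSyllable(c) == hangeul)
        val token: Token = if (hangeul) HangeulRun(run) else OtherRun(run)
        loop(tail, acc :+ token)
      }
    loop(input, Vector.empty)
  }
}
```

Only the `HangeulRun` tokens would need transliterating; the `OtherRun` tokens can be copied to the output as they are.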
This project implements the Revised Romanization of Korean. The transliteration rules currently supported are detailed in the tables below.
Hangeul vowel | ㅏ | ㅐ | ㅑ | ㅒ | ㅓ | ㅔ | ㅕ | ㅖ | ㅗ | ㅘ | ㅙ | ㅚ | ㅛ | ㅜ | ㅝ | ㅞ | ㅟ | ㅠ | ㅡ | ㅢ | ㅣ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Romanization | a | ae | ya | yae | eo | e | yeo | ye | o | wa | wae | oe | yo | u | wo | we | wi | yu | eu | ui | i |
Hangeul initial consonant | ㄱ | ㄲ | ㄴ | ㄷ | ㄸ | ㄹ | ㅁ | ㅂ | ㅃ | ㅅ | ㅆ | ㅇ | ㅈ | ㅉ | ㅊ | ㅋ | ㅌ | ㅍ | ㅎ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Romanization | g | kk | n | d | tt | r | m | b | pp | s | ss | – | j | jj | ch | k | t | p | h |
Hangeul final consonant | ㄱ | ㄲ | ㄳ | ㄴ | ㄵ | ㄶ | ㄷ | ㄹ | ㄺ | ㄻ | ㄼ | ㄽ | ㄾ | ㄿ | ㅀ | ㅁ | ㅂ | ㅄ | ㅅ | ㅆ | ㅇ | ㅈ | ㅊ | ㅋ | ㅌ | ㅍ | ㅎ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Romanization | k | k | k | n | n | n | t | l | k | m | p | l | l | p | l | m | p | p | t | t | ng | t | t | k | t | p | t |
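The three tables above list jamo in the same order the Unicode standard uses to compose precomposed syllables, so a syllable can be split into its jamo with integer arithmetic and romanized by table lookup. The following is a minimal sketch under that assumption; `JamoSketch` and its members are hypothetical names, not the hangeul4s API, and it ignores the interaction between adjacent syllables.

```scala
object JamoSketch {
  // Romanizations in Unicode jamo order, copied from the tables above.
  val initials: Vector[String] = Vector(
    "g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s",
    "ss", "", "j", "jj", "ch", "k", "t", "p", "h")
  val vowels: Vector[String] = Vector(
    "a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa", "wae",
    "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i")
  // Index 0 means "no final consonant".
  val finals: Vector[String] = Vector(
    "", "k", "k", "k", "n", "n", "n", "t", "l", "k", "m", "p", "l", "l",
    "p", "l", "m", "p", "p", "t", "t", "ng", "t", "t", "k", "t", "p", "t")

  // code point = 0xAC00 + (initial * 21 + vowel) * 28 + final
  def decompose(c: Char): Option[(Int, Int, Int)] = {
    val offset = c - 0xAC00
    if (offset < 0 || offset > 11171) None // not a precomposed syllable
    else Some((offset / 588, (offset % 588) / 28, offset % 28))
  }

  // Romanize a single syllable in isolation.
  def romanizeSyllable(c: Char): Option[String] =
    decompose(c).map { case (i, v, f) => initials(i) + vowels(v) + finals(f) }
}
```

For example, `JamoSketch.romanizeSyllable('한')` evaluates to `Some("han")`, while any character outside the syllable block yields `None`.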
Rows and columns correspond to final and initial consonants, respectively. Final/initial consonant pairs with irregular transliteration are displayed in bold.
F/I | ㅇ | ㄱ | ㄴ | ㄷ | ㄹ | ㅁ |
---|---|---|---|---|---|---|
ㄱ | g | kg | ngn | kd | ngn | ngm |
ㄴ | n | ng | nn | nd | ll, nn<sup>2</sup> | nm |
ㄷ | d, j<sup>1</sup> | tg | nn | td | nn | nm |
ㄹ | r | lg | ll, nn<sup>2</sup> | ld | ll | lm |
ㅁ | m | mg | mn | md | mn | mm |
ㅂ | b | pg | mn | pd | mn | mm |
ㅅ | s | tg | nn | td | nn | nm |
ㅇ | ng | ngg | ngn | ngd | ngn | ngm |
ㅈ | j | tg | nn | td | nn | nm |
ㅊ | ch | tg | nn | td | nn | nm |
ㅌ | t, ch<sup>3</sup> | tg | nn | td | nn | nm |
ㅎ | h | k | nn | t | nn | nm |

<sup>1</sup> Always transliterated as d in the current implementation

<sup>2</sup> Always transliterated as ll in the current implementation

<sup>3</sup> Always transliterated as t in the current implementation
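The final/initial table above amounts to a two-key lookup. As a rough sketch, three of its rows can be encoded as follows; `BoundarySketch` and `romanizePair` are hypothetical names, the remaining rows are omitted for brevity, and where the table lists two options the one used by the current implementation is taken.

```scala
object BoundarySketch {
  // Column order of the table: initial consonant of the following syllable.
  val initials: Vector[Char] = Vector('ㅇ', 'ㄱ', 'ㄴ', 'ㄷ', 'ㄹ', 'ㅁ')

  // Rows copied from the table, keyed by the final consonant.
  // Only three rows are shown; this is not the full rule set.
  val rows: Map[Char, Vector[String]] = Map(
    'ㄱ' -> Vector("g", "kg", "ngn", "kd", "ngn", "ngm"),
    'ㄴ' -> Vector("n", "ng", "nn", "nd", "ll", "nm"),
    'ㄹ' -> Vector("r", "lg", "ll", "ld", "ll", "lm")
  )

  // Romanize a final consonant together with the initial that follows it.
  // Returns None for pairs this sketch does not cover.
  def romanizePair(fin: Char, ini: Char): Option[String] =
    rows.get(fin).flatMap { row =>
      val col = initials.indexOf(ini)
      if (col >= 0) Some(row(col)) else None
    }
}
```

For instance, `BoundarySketch.romanizePair('ㄱ', 'ㄴ')` gives `Some("ngn")` rather than the plain concatenation `kn`.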
Copyright 2019 Sophie Collard <https://github.com/sophiecollard>
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.