Introduction to Chinese natural language processing [electronic resource] / Kam-Fai Wong ... [et al.].

Publication:
[San Rafael, Calif.] : Morgan & Claypool Publishers, c2010.
Format/Description:
Book
1 online resource (x, 148 p.) : ill
Series:
Synthesis lectures on human language technologies, 1947-4059 ; #4
Subjects:
Natural language processing (Computer science).
Chinese language.
Summary:
This book introduces Chinese language-processing issues and techniques to readers who already have a basic background in natural language processing (NLP). Since the major difference between Chinese and Western languages is at the word level, the book primarily focuses on Chinese morphological analysis and introduces the concept, structure, and interword semantics of Chinese words. The following topics are covered: a general introduction to Chinese NLP; Chinese characters, morphemes, and words and the characteristics of Chinese words that have to be considered in NLP applications; Chinese word segmentation; unknown word detection; word meaning and Chinese linguistic resources; interword semantics based on word collocation and NLP techniques for collocation extraction.
Contents:
1. Introduction
What is Chinese NLP
About this book
2. Words in Chinese
Introduction
Characters, morphemes, and words
Characters
Morphemes
Words
Word formation in Chinese
Disyllabic compounds
Trisyllabic compounds
Quadrasyllabic compounds
Other morphological processes in Chinese
Ionization
Word identification and segmentation
Summary
3. Challenges in Chinese morphological processing
Introduction
Chinese characters
Large number of characters
Simplified and traditional characters
Variant characters
Dialect characters and dialectal use of standard characters
Multiple character encoding standards
Textual conventions
Printing format
Punctuation practice
Linguistic characteristics
Few formal morphological markings
Parts of speech
Homonyms and homographs
Ambiguity
OOV words
Regional variation
Stylistic variation
Summary
4. Chinese word segmentation
Introduction
Two main challenges
Algorithms
Character-based approach
Word-based approach
Word segmentation ambiguity
Ambiguity definition
Disambiguation algorithms
Benchmarks
Standards
Bakeoff evaluation
Free tools
Chinese lexical analysis system
MSRSeg system
Summary
5. Unknown word identification
Introduction
Unknown word detection and recognition
Chinese person name identification
Chinese organization name identification
Chinese place name recognition
Summary.
Notes:
Title from PDF t.p. (Morgan & Claypool, viewed Dec. 3, 2009).
Includes bibliographical references (p. 135-145).
Contributor:
Wong, Kam-Fai.
Other format:
Print version: Introduction to Chinese Natural Language Processing.
ISBN:
9781598299335 (ebook)
1598299336 (ebook)
9781598299328 (pbk.)
1598299328 (pbk.)
OCLC:
472468557
Publisher Number:
10.2200/S00211ED1V01Y200909HLT004 doi
Access Restriction:
Restricted for use by site license.