Unicode Standard Annex #29, Unicode Text Segmentation, will be updated for Unicode 9.0. A draft of the proposed update is available for general public review and comment.
Unicode CLDR 28 provides an update to the key building blocks for software supporting the world's languages. This data is used by all major software systems for their software internationalization and localization, adapting software to the conventions of different languages for common software tasks. The following summarizes the main improvements in the release.
The Unicode Consortium is pleased to announce that Facebook has joined as a full member.
The Unicode Consortium is pleased to announce that Emoji One has joined as a supporting member. Emoji One is a small, independent group of emoji developers providing an open source emoji set for digital and non-digital use worldwide.
Wednesday, September 30, 2015
UAX #29, Unicode Text Segmentation, update to improve Mongolian word segmentation
The Word_Break classification of U+202F NARROW NO-BREAK SPACE (NNBSP) is revised to correct the text segmentation behavior of U+202F for Mongolian usage. For further background on this issue and possible ways to address it, see PRI #308, Property Change for U+202F NARROW NO-BREAK SPACE (NNBSP).
In this revision, the formerly empty Prepend class of the Grapheme_Cluster_Break property is redefined to consist of all prefixed format control characters and a few other characters with certain Indic_Syllabic_Category property values.
The corresponding property value changes will be incorporated in the UCD data files for Unicode 9.0.
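To illustrate what a non-empty Prepend class means for grapheme clusters, here is a minimal Python sketch. It implements only two ideas from UAX #29: no break before an Extend character, and no break after a Prepend character. The `PREPEND_SAMPLE` set is a small hand-picked sample of prefixed format controls (not the full Unicode 9.0 data), and approximating Extend by general categories Mn and Me is a simplification. All names in the code are hypothetical.

```python
import unicodedata

# Assumption: a tiny hand-picked sample of Prepend characters (prefixed
# format controls such as U+0600 ARABIC NUMBER SIGN). The real set comes
# from the UCD data files and is larger.
PREPEND_SAMPLE = set(range(0x0600, 0x0606)) | {0x06DD}


def is_prepend(ch: str) -> bool:
    return ord(ch) in PREPEND_SAMPLE


def is_extend(ch: str) -> bool:
    # Simplification: treat nonspacing and enclosing marks as Extend.
    return unicodedata.category(ch) in ("Mn", "Me")


def clusters(text: str) -> list[str]:
    """Split text into clusters using only the two simplified rules above."""
    result: list[str] = []
    for ch in text:
        # Attach ch to the previous cluster if ch is Extend, or if the
        # previous character is Prepend (no break after Prepend).
        if result and (is_extend(ch) or is_prepend(result[-1][-1])):
            result[-1] += ch
        else:
            result.append(ch)
    return result
```

With this sketch, U+0600 followed by two Arabic-Indic digits yields two clusters: the number sign stays attached to the first digit instead of standing alone. A production segmenter should rely on a library that implements the full rule set and the actual property data.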
Thursday, September 17, 2015
CLDR Version 28 Released
- General locale data. Overall, about 5% of the data items in this release are new (see Growth), while about 8% have corrections. Notable changes include a major review of and improvement to Spanish locales for Latin America; the addition of two new “modern-coverage” locales (Belarusian and Irish); and moving certain data from en_GB to en_001 for improved quality and reduced data size in locales that use en_GB conventions.
- Formatting. There are a number of new units and types of formats, with a major revision to the day period rules—preferred for many languages instead of AM/PM (“10:30 at night”)—with localizations; the addition of compact formatting for currencies (“€10M”, “€10 million”), and the addition of more unit measures, including 7 new general units (duration-century), 21 new per-unit types, 4 new units for measuring personal age (needed for some languages), and new coordinate units for formatting latitude and longitude across languages (“10°N”).
- Identifiers. The new features extend the ability to specify subregions of countries, validate identifiers, and customize locales, including the addition of subdivisions of countries, such as Scotland and California (localized names are not yet present, except for English); the addition of validity data for currency codes, measurement units, and locale identifier elements (allowing validation of Unicode language and locale identifiers without requiring BCP47 data); the addition of seven -u- extension keys and corresponding types to allow customization of locales (“cf” for specifying standard vs accounting currency formats), and the clarification of the specification of identifiers, especially for validity testing.
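As a rough illustration of the -u- extension keys mentioned in the Identifiers item, the following Python sketch extracts key/type pairs from a locale identifier. The function name is hypothetical, and the parsing is deliberately simplified: it ignores attributes, does no validity checking against CLDR data, and assumes the "cf" type for accounting formats is spelled `account`.

```python
def unicode_extension_keywords(locale_id: str) -> dict[str, str]:
    """Return the keywords found in the -u- extension of a locale identifier."""
    subtags = locale_id.replace("_", "-").lower().split("-")

    # Locate the "u" singleton; stop looking once private use ("x") begins.
    start = None
    for index, subtag in enumerate(subtags[1:], 1):
        if subtag == "x":
            break
        if subtag == "u":
            start = index
            break
    if start is None:
        return {}

    keywords: dict[str, str] = {}
    key = None
    for subtag in subtags[start + 1:]:
        if len(subtag) == 1:
            break                      # another singleton ends the extension
        if len(subtag) == 2:
            key = subtag               # two-character subtags are keys
            keywords[key] = ""
        elif key is not None:
            # Longer subtags belong to the current key's type.
            keywords[key] = f"{keywords[key]}-{subtag}".lstrip("-")
        # Subtags before the first key are attributes; this sketch skips them.
    return keywords
```

For example, `unicode_extension_keywords("en-US-u-cf-account")` returns `{"cf": "account"}`, which a formatter could use to choose the accounting currency pattern over the standard one.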
Tuesday, September 15, 2015
Facebook Joins as Full Member of the Unicode Consortium
Founded in 2004, Facebook’s mission is to give people the power to share and make the world more open and connected. People use Facebook to stay connected with friends and family, to discover what’s going on in the world, and to share and express what matters to them.
We look forward to their contributions to Unicode projects and are grateful for their financial support of the consortium’s work. Full members of the consortium have a vote in all technical committees, and in the governance of the consortium. See the complete list of members.
Monday, September 14, 2015
Emoji One Joins the Unicode Consortium
Emoji One is very motivated to support emoji standards, creativity, and innovation to the best of their abilities. Rick Moby, Founder, has said, “We’re honored to be welcomed and included with this unique group of individuals responsible for the emoji and internationalization standards that are so vital to the community.” For more, see Emoji One’s announcement.
We look forward to their contributions to Unicode projects, and are grateful for their financial support of the consortium’s work. Supporting members of the consortium have a half vote in all technical committees. See the complete list of members.