Speech Enabled GPS Based Navigation System
in Hungarian for Blind People on Symbian Based
Mobile Devices
B. Tóth, G. Németh
Budapest University of Technology and Economics, Department of Telecommunications and Media Informatics,
Magyar tudósok körútja 2., Budapest, 1117, Hungary
Phone: (36)-(1)-4633883, {toth.b, [email protected]}
Abstract – The aim of the present study is to create a speech enabled GPS based navigation system for blind people. The speech user interface was designed in consultation with the Hungarian Association of Blind and Visually Impaired People. The application will satisfy the special needs of the target user group. It is also a particular aim to use only easily accessible, low budget mobile devices. For this reason Symbian S60, 2nd edition devices and GPS receivers with a Bluetooth radio link were chosen as the hardware of the navigation system.

I. INTRODUCTION

Smart mobile devices are increasingly common as their price decreases. The performance and storage capacity of these devices make them capable of running complex calculations, such as speech synthesis and GPS data processing. It is favorable to use easily accessible devices tailored to the users' needs instead of task specific hardware components, as the price of the former solution is moderate.

Unfortunately blind and visually impaired people are often not well supported. It is hard or even impossible for them to use new technologies, like cellular phones and navigation systems, as these devices typically have graphical user interfaces only. There are some existing solutions for speech output on mobile devices, like Nuance Talks [1], which is a screen reader for Symbian based devices. Some new smartphones, like the HTC Kaiser and the Nokia N70, have basic speech synthesis and recognition features, but they support only English or some other major languages.

There are existing navigation systems for blind people, like BELATSZ, StreetTalk, MoBIC (Mobility of Blind and Elderly people Interacting with Computers) [2], Drishti [3,7] and WayFinder with Nuance Talks [1], but a mobile solution is not yet available for blind people in Hungarian.

Our main goal is to create a mobile navigation system with a Hungarian speech user interface, which helps blind and visually impaired people to navigate on their everyday routes in urban environments.

II. PROBLEM STATEMENT

A. Present Solutions

There are several solutions for navigation systems with speech output. Let us investigate the most important ones.

BELATSZ is a Hungarian acronym for 'Beszélő térkép LÁTássérültek Számára' (Speaking map for visually impaired people). It was developed by Topolisz Ltd. and runs under MS-DOS compatible systems. The user has to enter the beginning and end point of a route in Budapest (the capital of Hungary), and the application generates the whole route precisely, step by step. The blind user can then listen to these instructions with a screen reader application, like JAWS for Windows [4]. This is not a perfect solution, but at least blind people can get some help when they learn new routes.

MoBIC, which is an acronym for Mobility of Blind and Elderly people Interacting with Computers, was a European Union project between 1994 and 1996. With MoBIC users were able to plan routes, and with speech output it was possible to navigate. Because only low-complexity mobile devices were available at the time of the development, the system ran on desktop and laptop computers only, with a GPS receiver connected to the computer. Consequently mobility was rather low, although user tests were carried out. These tests showed that additional information, like the precise coordinates of entrances, should be included in the database. Unfortunately the project was stopped in 1996.

The Brunel Navigation System for the Blind [5] was developed at Brunel University, United Kingdom. Besides speech it uses other modalities, like Tugs, which was also developed in their laboratory. Tugs has five outputs, which are attached to different parts of the human body; when an output is activated, it vibrates. With this technology the system can tell the user which direction to go without using any audio output. The system includes on the client side a mobile device, a GPS receiver, an electronic compass and a video camera. On the server side there is a map database, a DGPS server and a processing unit. All the information is sent from the client to the server, where it is processed, and the result is sent back to the client, where it is read out or signaled by Tugs. It is possible to turn the camera on; then an operator on the server side can help the blind user via voice. The main disadvantages of this system are the continuous data transmission and the need for an operator.

StreetTalk [6], developed by Freedom Scientific, runs on PAC Mate, which is a Pocket PC based mobile device tailored to the needs of blind users. It can be controlled either by a 20-key special keyboard or by a QWERTY keyboard. StreetTalk connects to the GPS receiver through Bluetooth, and it is based on the Destinator navigation system. StreetTalk's features include route planning, but maps for Hungary are not available yet. Furthermore the PAC Mate device is expensive.

Drishti [3,7] is an outdoor navigation system with speech input and output, developed at the University of Florida. It is desktop and laptop computer based and it uses Differential GPS (DGPS). With the help of DGPS, as their tests showed, an accuracy of up to 22 cm could be achieved. One of the developers' main goals was to handle dynamic environment variables, like road constructions.

Trekker was first developed by VisuAid (Canada); from 2003 HumanWare continued its development. It is a Windows Mobile based navigation software which communicates with the GPS receiver via Bluetooth. It has an advanced POI (Points of Interest) and map database, but unfortunately only for the United States.

Trinetra [8] was developed at Carnegie Mellon University. It runs on smartphones, and for positioning it uses GPS and, where available, RFID. It uses speech output and has some features that enable blind users to use public transport, if the vehicle is RFID capable. Trinetra uses a client-server architecture, consequently at least a GPRS connection is required.
More examples are discussed in [9]. As described above, there are already mobile based applications and navigation systems for blind people, but unfortunately none of them is available in Hungarian. Therefore our aim is to create a Hungarian system with the latest technologies available.

B. Global Positioning System

The question could be raised: is GPS applicable for defining the position in a navigation system for blind people? Unfortunately GPS for public use is not very accurate, which makes it harder to define the precise position of the user. Because of this inaccuracy, at first glance we should say no, it is not applicable for our aims. It might even easily navigate the blind user e.g. from the pavement onto the road.

Fortunately this inaccuracy can be quite well compensated by map databases, by algorithms (like a sliding window) and by applying additional devices, like a compass or a step counter. The accuracy may also be increased by applying DGPS.

We can conclude that GPS is applicable for our purpose, but only in outdoor environments. In buildings the GPS signal is usually lost, consequently we cannot tell the position of the user. There are several studies where indoor positioning is solved by ultrasonic sensors or by RFIDs. At the time of writing, none of these solutions are widespread or available in smartphones, so indoor navigation was excluded from the current paper.

C. User Group

Before the development started, the Hungarian Blind and Visually Impaired People's Association was consulted. According to their experience, blind people mostly use routes which are well known to them. To learn a route, they need an additional person who can help them. At the end of the learning process the blind person knows exactly what to do next at each step.

This process is easier for those who were born blind, and it is harder for people who have lost their vision later or recently. According to the association's opinion, speech based navigation software is most beneficial for the first group, but it can also help the other group during the learning process. If for some reason a blind person has to take a new route, then a reliable navigation system could be very beneficial.

Another aspect is long distance travel on buses and trains. The name of the actual station is not always announced on buses, and almost never on trains (except InterCity trains). Furthermore buses may pass stops when nobody is getting on or off, and trains may stop where there is no stop (e.g. waiting for another train to pass). So blind people cannot be certain about when to get on or off even if they count the number of stops. The association proposed to include information on bus and train stops and stations in the system.
III. THE NAVISPEECH SYSTEM

In this chapter the proposed NaviSpeech system, its architecture, the features and challenges of creating a Speech User Interface (SUI) on mobile devices and Human Computer Interaction (HCI) issues are investigated. A more detailed description of the overall system architecture can be read in [9].

A. Features

NaviSpeech already includes several features and new ones will also be implemented in the near future (see section IV for more details). Most of these features were requested by the target user group, and all of them were supervised and accepted by blind users. Currently NaviSpeech has the following main characteristics:

Complete Speech User Interface, including speech enabled multi-level menus, shortcut keys, automatic navigation; information on demand (next waypoint, previous waypoint, etc.; more information can be found below), help system, options, additional features (time, date, coordinates).

In the case of speech based navigation through a route the system informs the user with synthesized voice about the distance and the direction of the next waypoint. After reaching a waypoint the next point is always set automatically, until the final waypoint is reached.

The direction of a route can be reversed in order to navigate back along the path to the first waypoint.

If the user approaches a waypoint to 20 meters or nearer, then NaviSpeech automatically tells the name of this waypoint, the name of the next waypoint and the direction to the next waypoint.

The application automatically alerts the user when the route is left. There are several options for navigating back to the route or to the next/previous waypoint: go to the first waypoint directly and then navigate through the route; go to the nearest waypoint and then navigate to the last waypoint; go to the nearest waypoint and then navigate to the first waypoint; go to the nearest point of the route and then navigate to the last waypoint; or go to the nearest point of the route and then navigate to the first waypoint. The last case is shown in Figure 1.

Fig. 1. Going back to the nearest point of the route and navigating back to the first waypoint

NaviSpeech can read out the name of the next waypoint and the distance to it. The application can also tell on request the direction of the next waypoint from the actual position and how far the user should walk to reach it. The route can be changed on the fly: the user can choose the waypoint NaviSpeech should navigate him/her to; furthermore the nearest waypoint can also be selected as the target.
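Both the 20 meter proximity announcement and the on-demand distance and direction reports reduce to computing the great-circle distance and the initial bearing between the current GPS fix and the next waypoint. The following platform-neutral C++ sketch illustrates this calculation; the haversine formula, the structure and function names and the 20 m constant are assumptions based on the description above, not the original Symbian implementation.

// Illustrative sketch (not the original Symbian code): distance and bearing
// from the current GPS fix to the next waypoint, plus the 20 m announcement
// threshold described above. Names are hypothetical.
#include <cmath>
#include <cstdio>

struct Waypoint { const char* name; double lonDeg; double latDeg; };

constexpr double kPi = 3.14159265358979323846;
constexpr double kEarthRadiusM = 6371000.0;
constexpr double kAnnounceRadiusM = 20.0;   // "20 meters or nearer"

static double deg2rad(double d) { return d * kPi / 180.0; }

// Great-circle (haversine) distance in meters.
double DistanceM(double lat1, double lon1, double lat2, double lon2) {
    const double dLat = deg2rad(lat2 - lat1);
    const double dLon = deg2rad(lon2 - lon1);
    const double a = std::sin(dLat / 2) * std::sin(dLat / 2) +
                     std::cos(deg2rad(lat1)) * std::cos(deg2rad(lat2)) *
                     std::sin(dLon / 2) * std::sin(dLon / 2);
    return 2.0 * kEarthRadiusM * std::asin(std::sqrt(a));
}

// Initial bearing in degrees (0 = north, 90 = east).
double BearingDeg(double lat1, double lon1, double lat2, double lon2) {
    const double dLon = deg2rad(lon2 - lon1);
    const double y = std::sin(dLon) * std::cos(deg2rad(lat2));
    const double x = std::cos(deg2rad(lat1)) * std::sin(deg2rad(lat2)) -
                     std::sin(deg2rad(lat1)) * std::cos(deg2rad(lat2)) * std::cos(dLon);
    double b = std::atan2(y, x) * 180.0 / kPi;
    return (b < 0.0) ? b + 360.0 : b;
}

int main() {
    Waypoint next{"Informatics Building, north-east", 19.06057, 47.47319};
    double latNow = 47.47285, lonNow = 19.05922;           // current fix
    double d = DistanceM(latNow, lonNow, next.latDeg, next.lonDeg);
    double b = BearingDeg(latNow, lonNow, next.latDeg, next.lonDeg);
    std::printf("%s: %.0f m, bearing %.0f deg\n", next.name, d, b);
    if (d <= kAnnounceRadiusM) std::printf("(would announce waypoint reached)\n");
}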
The user can also ask for the direction s/he is heading. The current direction is calculated as the average direction of the last five seconds with a sliding window. If there is a radical change in direction, then the application takes into account only the points after the change. This compass feature can be turned on and off from the menu; if it is turned on, it is read out on demand.
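A minimal sketch of the five second sliding window direction estimate described above, assuming one heading sample per second. Headings are averaged as unit vectors so that values around 0/360 degrees do not cancel out, and the window is restarted after a radical turn; the 60 degree threshold is an assumed value, the paper does not specify it.

// Illustrative sketch of the sliding-window heading average (not the original code).
#include <cmath>
#include <cstddef>
#include <deque>

constexpr double kPi = 3.14159265358979323846;
constexpr std::size_t kWindow = 5;         // last five seconds, one heading per second
constexpr double kTurnThresholdDeg = 60;   // "radical change" threshold (assumed value)

class HeadingFilter {
public:
    // Feed one heading per second, in degrees; returns the smoothed heading.
    double Update(double headingDeg) {
        if (!samples_.empty() &&
            AngleDiffDeg(headingDeg, samples_.back()) > kTurnThresholdDeg) {
            samples_.clear();              // radical turn: keep only points after the change
        }
        samples_.push_back(headingDeg);
        if (samples_.size() > kWindow) samples_.pop_front();
        // Average on the unit circle to handle wrap-around at 360 degrees.
        double sx = 0, sy = 0;
        for (double h : samples_) { sx += std::cos(h * kPi / 180); sy += std::sin(h * kPi / 180); }
        double avg = std::atan2(sy, sx) * 180 / kPi;
        return avg < 0 ? avg + 360 : avg;
    }
private:
    static double AngleDiffDeg(double a, double b) {
        double d = std::fabs(a - b);
        return d > 180 ? 360 - d : d;
    }
    std::deque<double> samples_;
};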
General GPS information can be read out (longitude, latitude, date, time, etc.).

The system supports GPS Trackmaker's GTM format. With GPS Trackmaker [10] one can easily plan a route; furthermore there are existing route planner homepages1 which export the planned route in this format. GTM is widely supported by GPS devices and the GTM format is publicly available.

1 like http://terkep.t-online.hu

NaviSpeech also employs its own format, where the names of the waypoints and their coordinates are entered in a text file. An example text file is given below:

#comment: walking around the Informatics building at BUTE
01 Informatics Building, north-west 19.05922 47.47285
02 Informatics Building, north-east 19.06057 47.47319
03 Informatics Building, south-east 19.06282 47.47284
04 Informatics Building, south-west 19.05977 47.47212
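The text format above can be parsed line by line: a leading '#' marks a comment, otherwise each line holds an index, a free text waypoint name and two trailing coordinates (longitude first, then latitude, as in the example). The following reader is a hypothetical, platform-neutral illustration, not the file handling code of the actual Trip class.

// Illustrative parser for the NaviSpeech waypoint text format shown above.
// Not the original implementation; names are hypothetical.
#include <cstddef>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

struct Waypoint {
    int index;
    std::string name;
    double lonDeg;   // the example lists longitude first ...
    double latDeg;   // ... then latitude
};

std::vector<Waypoint> LoadOwnFormat(const std::string& path) {
    std::vector<Waypoint> route;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') continue;            // skip comments
        std::istringstream ls(line);
        std::vector<std::string> tokens;
        std::string tok;
        while (ls >> tok) tokens.push_back(tok);
        if (tokens.size() < 4) continue;                         // index, name..., lon, lat
        Waypoint wp{};
        wp.index  = std::stoi(tokens.front());
        wp.latDeg = std::stod(tokens.back());
        wp.lonDeg = std::stod(tokens[tokens.size() - 2]);
        for (std::size_t i = 1; i + 2 < tokens.size(); ++i)      // middle tokens form the name
            wp.name += (i > 1 ? " " : "") + tokens[i];
        route.push_back(wp);
    }
    return route;
}

int main() {
    for (const auto& wp : LoadOwnFormat("route.txt"))
        std::cout << wp.index << ": " << wp.name
                  << " (" << wp.latDeg << ", " << wp.lonDeg << ")\n";
}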
B. Architecture

Symbian Series 60, 2nd Edition devices were chosen as the target platform. The main reason was that these devices are available at a moderate price nowadays and they possess the necessary communication interfaces. A GPS receiver is paired with the smartphone via a Bluetooth wireless connection. The GPS receiver connects to the available satellites and transmits its data to the smartphone using the NMEA-0183 protocol. The NaviSpeech software runs on the mobile device. NaviSpeech processes the received information, calculates the current position and the possible errors according to the route and the path the user has already walked along, and tells the user at given intervals, or on request, which way to go. The main architecture of the software can be seen in Figure 2.

Fig. 2. The main architecture of NaviSpeech

As the next step the programming architecture of the application is introduced.

The SpeechPlayer class controls the text-to-speech engine, which is called Profivox. Profivox was developed in the authors' laboratory; it is a diphone-based speech synthesizer (see subsection III./E. for more details). This class initializes Profivox; sets the volume, speed and type of the voice; synthesizes speech from text into a heap descriptor; plays back the waveform stored in the heap descriptor; and finally closes/deinitializes Profivox.

The BTClientEngine class connects to the GPS receiver and reads the data through an emulated serial port (which is physically the Bluetooth radio link). The read data are stored in heap descriptors. The content of these heap descriptors is then fed to a DataSink object (see the NMEA Parser). Furthermore this class is also responsible for signaling when the status of the Bluetooth connection changes.

The NMEA Parser works as a DataSink: it receives and concatenates incoming messages. When a whole message has been received, it tries to handle it as an NMEA-0183 message. If this succeeds, it updates its internal, public variables according to the NMEA-0183 message. These variables are the longitude, latitude, date and time.
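As a rough illustration of the parser's task, the sketch below splits a complete NMEA-0183 sentence into comma separated fields and extracts the UTC time, latitude and longitude from a $GPGGA sentence, converting the degrees-and-minutes notation to decimal degrees. It is a simplified, platform-neutral example: checksum verification, the other sentence types and the DataSink plumbing are omitted, and the sample sentence is invented.

// Simplified NMEA-0183 field extraction (illustrative only, checksum ignored).
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// "4728.371" with hemisphere 'N'/'S' (or 'E'/'W') converted to signed decimal degrees.
static double NmeaToDecimalDeg(const std::string& value, char hemisphere) {
    if (value.empty()) return 0.0;
    double raw = std::stod(value);
    double deg = static_cast<int>(raw / 100);
    double minutes = raw - deg * 100.0;
    double dec = deg + minutes / 60.0;
    return (hemisphere == 'S' || hemisphere == 'W') ? -dec : dec;
}

struct Fix { double latDeg; double lonDeg; std::string utcTime; bool valid; };

Fix ParseGga(const std::string& sentence) {
    std::vector<std::string> f;
    std::stringstream ss(sentence);
    std::string field;
    while (std::getline(ss, field, ',')) f.push_back(field);
    Fix fix{0, 0, "", false};
    // $GPGGA,time,lat,N/S,lon,E/W,quality,...
    if (f.size() > 6 && f[0] == "$GPGGA" && f[6] != "0") {
        fix.utcTime = f[1];
        fix.latDeg = NmeaToDecimalDeg(f[2], f[3].empty() ? 'N' : f[3][0]);
        fix.lonDeg = NmeaToDecimalDeg(f[4], f[5].empty() ? 'E' : f[5][0]);
        fix.valid = true;
    }
    return fix;
}

int main() {
    Fix fix = ParseGga("$GPGGA,120043,4728.371,N,01903.553,E,1,05,1.5,150.0,M,42.0,M,,*47");
    if (fix.valid)
        std::cout << "UTC " << fix.utcTime << "  lat " << fix.latDeg
                  << "  lon " << fix.lonDeg << "\n";
}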
The Trip class represents a route. The route can be loaded either from NaviSpeech's own route description file or from a GPS TrackMaker file (see subsection III./A. for more details). This description file contains the waypoints in sequence. The class opens the GTM or NaviSpeech route file, reads the waypoints and loads their information (name, longitude, latitude) into internal, public variables. These variables are accessed later by the software to calculate the previous, current and next waypoints from the actual coordinates.
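The role the paper assigns to the Trip class can be pictured as follows: given the loaded waypoint list and the actual coordinates, find the nearest waypoint and step to its neighbours. The sketch below is an illustration with assumed names and a simple distance approximation, not the original class.

// Illustrative previous/nearest/next waypoint selection (not the original Trip class).
#include <cmath>
#include <cstddef>
#include <iostream>
#include <limits>
#include <vector>

struct Waypoint { double latDeg; double lonDeg; };

// Equirectangular approximation, adequate for short urban distances;
// the haversine version from the earlier sketch could be used instead.
static double ApproxDistanceM(double lat1, double lon1, double lat2, double lon2) {
    constexpr double kPi = 3.14159265358979323846, kEarthRadiusM = 6371000.0;
    double latMid = (lat1 + lat2) / 2.0 * kPi / 180.0;
    double dLat = (lat2 - lat1) * kPi / 180.0;
    double dLon = (lon2 - lon1) * kPi / 180.0 * std::cos(latMid);
    return kEarthRadiusM * std::sqrt(dLat * dLat + dLon * dLon);
}

std::size_t NearestWaypoint(const std::vector<Waypoint>& route, double latNow, double lonNow) {
    std::size_t best = 0;
    double bestDist = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < route.size(); ++i) {
        double d = ApproxDistanceM(latNow, lonNow, route[i].latDeg, route[i].lonDeg);
        if (d < bestDist) { bestDist = d; best = i; }
    }
    return best;
}

// Previous and next indices relative to the current target, clamped to the route ends.
std::size_t PreviousWaypoint(std::size_t current) { return current == 0 ? 0 : current - 1; }
std::size_t NextWaypoint(std::size_t current, std::size_t count) {
    return current + 1 < count ? current + 1 : count - 1;
}

int main() {
    std::vector<Waypoint> route = {{47.47285, 19.05922}, {47.47319, 19.06057},
                                   {47.47284, 19.06282}, {47.47212, 19.05977}};
    std::size_t current = NearestWaypoint(route, 47.47300, 19.06000);
    std::cout << "nearest: " << current
              << "  next: " << NextWaypoint(current, route.size()) << '\n';
}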
The Controller class is responsible for processing the user's interactions, such as pressing keys, navigating in the menu and selecting menu items. This class also supervises and controls all the other classes. Furthermore it is responsible for the speech output of NaviSpeech, thus the Speech User Interface (SUI) is also realized here. The speech enabled menus are realized with the help of the Avkon UI, which sends its own 'menu events' to the Controller class, just like it sends keypress events.

The Container class creates and controls the Graphical User Interface (GUI), including dialogs and menus.

Fig. 3. The programming architecture of NaviSpeech

The relations of the classes are shown in Figure 3. In the figure there are four main components. NaviSpeech is the navigation system itself. The Symbian standard libraries are the built-in Application Programming Interfaces (APIs) of the Symbian SDK; they contain the main classes and functions needed to realize e.g. the Bluetooth connection, the audio playback, etc. The Profivox TTS Engine is the speech synthesizer, which was also developed in the authors' lab; it converts the input text into speech. The Phone hardware is the physical mobile device, which can be accessed with the help of the standard libraries only.

For on-device debugging purposes a logger class is also applied. It contains one static function, which opens a file, writes numeric or text based data into the file with a timestamp, and closes the file. On 3rd generation Symbian devices on-device debugging is possible. Logging is turned off in the release version of NaviSpeech.
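The logger described above boils down to a single static call that opens the log file, appends one timestamped line and closes the file again. A platform-neutral sketch follows (the real class uses the Symbian file APIs, which are not shown in the paper); mirroring the paper, logging is compiled out of release builds.

// Illustrative timestamped file logger (not the original Symbian logger class).
#include <ctime>
#include <fstream>
#include <string>

class Logger {
public:
    // Opens the log file, appends one timestamped line, then closes it,
    // so a crash loses at most the current line.
    static void Log(const std::string& message) {
#ifdef NDEBUG
        (void)message;               // logging disabled in release builds
#else
        std::ofstream out("navispeech.log", std::ios::app);
        std::time_t now = std::time(nullptr);
        char stamp[32];
        std::strftime(stamp, sizeof(stamp), "%Y-%m-%d %H:%M:%S", std::localtime(&now));
        out << stamp << "  " << message << '\n';
#endif
    }
};

int main() { Logger::Log("GPS connection established"); }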
Furthermore, for debugging purposes a GPS emulation mode was also implemented. This means that the actual coordinates are read from a file instead of the GPS receiver. With the help of the GPS emulation mode the features of the application can easily be tested without moving around.
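The GPS emulation mode can be pictured as a drop-in replacement for the position source: fixes are read line by line from a prepared text file instead of arriving over the Bluetooth/NMEA path. The sketch below assumes a simple "latitude longitude" per line test file; the file name and the listener interface are hypothetical.

// Illustrative GPS emulation: replay fixes from a text file instead of the receiver.
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

struct PositionListener {                 // hypothetical callback interface
    virtual void OnFix(double latDeg, double lonDeg) = 0;
    virtual ~PositionListener() = default;
};

void ReplayFixes(const std::string& path, PositionListener& listener) {
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ls(line);
        double lat = 0, lon = 0;
        if (ls >> lat >> lon) listener.OnFix(lat, lon);   // same code path as live fixes
    }
}

struct PrintListener : PositionListener {
    void OnFix(double latDeg, double lonDeg) override {
        std::cout << "fix: " << latDeg << ", " << lonDeg << '\n';
    }
};

int main() {
    PrintListener p;
    ReplayFixes("emulated_route.txt", p);   // test features without moving around
}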
C. Memory Consumption

Memory tests were carried out in order to determine the memory consumption of NaviSpeech. There are three steps of memory usage: (1) loading the software into memory, (2) connecting to the GPS receiver, (3) using the TTS engine. Memory usage during all these steps is shown in Figure 4. Although the database of the text-to-speech engine is not loaded into memory3, the engine itself still needs about 250 kBytes of memory. The basic memory consumption of NaviSpeech (1) is about 280 kBytes at the time of writing.

3 It is not a problem that the database is read directly from the memory card, as the access times of operative memory and storage are comparable in mobile phones.

Fig. 4. The memory consumption of NaviSpeech

D. GPS Receiver's Accuracy

According to our experience during outdoor usage, the accuracy of the GPS receiver is about 4-8 meters without any error correction algorithm. Sometimes the error can even be over 20 meters. With basic error correction algorithms (using a sliding window to avoid large oscillations) these large errors can be detected and eliminated, and the accuracy also increases in the lower error range. In the future, the implementation of additional error correction algorithms and the use of a differential GPS receiver are also planned.
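One simple way to realize the sliding window error correction mentioned above is to keep the last few fixes, compare each new fix against their median position and discard fixes that jump implausibly far. The sketch below illustrates the idea; the window size and the jump threshold are invented values, not parameters taken from the paper.

// Illustrative sliding-window rejection of GPS outliers (assumed parameters).
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <deque>
#include <vector>

struct Fix { double latDeg; double lonDeg; };

constexpr std::size_t kWindow = 5;          // number of recent fixes kept (assumed)
constexpr double kMaxJumpM = 25.0;          // reject jumps larger than this (assumed)

class FixFilter {
public:
    // Returns true if the fix is accepted (and added to the window).
    bool Accept(const Fix& fix) {
        if (window_.size() >= 3) {
            Fix med{Median([](const Fix& f) { return f.latDeg; }),
                    Median([](const Fix& f) { return f.lonDeg; })};
            if (ApproxDistanceM(med, fix) > kMaxJumpM) return false;   // outlier
        }
        window_.push_back(fix);
        if (window_.size() > kWindow) window_.pop_front();
        return true;
    }
private:
    template <typename Get>
    double Median(Get get) const {
        std::vector<double> v;
        for (const Fix& f : window_) v.push_back(get(f));
        std::sort(v.begin(), v.end());
        return v[v.size() / 2];
    }
    static double ApproxDistanceM(const Fix& a, const Fix& b) {
        constexpr double kPi = 3.14159265358979323846, kR = 6371000.0;
        double latMid = (a.latDeg + b.latDeg) / 2 * kPi / 180;
        double dLat = (b.latDeg - a.latDeg) * kPi / 180;
        double dLon = (b.lonDeg - a.lonDeg) * kPi / 180 * std::cos(latMid);
        return kR * std::sqrt(dLat * dLat + dLon * dLon);
    }
    std::deque<Fix> window_;
};

int main() {
    FixFilter filter;
    filter.Accept({47.47285, 19.05922});
    filter.Accept({47.47287, 19.05925});
    filter.Accept({47.47289, 19.05929});
    bool ok = filter.Accept({47.47500, 19.06500});   // large jump: rejected
    return ok ? 1 : 0;
}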
E. Speech Synthesis in Mobile Devices

Creating applications with speech output for mobile devices is a challenging task. Different platforms are compatible neither on the binary nor on the source code level. Therefore if an application is to be released on several platforms, many parts or even the whole source code must be rewritten. Unfortunately JAVA ME [11] supports only limited features and its performance is also quite slow for complex tasks, such as text-to-speech conversion. Furthermore there is no standardized Speech Application Programming Interface (SAPI) [12] for these devices, so the text-to-speech engine must be rewritten or at least significantly modified for the different platforms. Consequently, if a 3rd party developer would like to complement his/her application with speech output, then s/he has to purchase a TTS engine, which may greatly increase the cost of development. Furthermore many TTS engines support only one or a few major languages, so multilingual support can hardly be achieved.

The performance and storage capacity of current smartphones are suitable for speech synthesis, although the latest technologies can be ported only with compromises.

Corpus based speech synthesizers [13] produce the best quality nowadays, although the size of their database is quite large (it can reach the gigabyte range), so a memory card is required. Even if the storage size is sufficient, it is not favorable to have such a large TTS as part of the main application (which is in the megabyte range).

Most of the TTS engines currently used in mobile devices are concatenative ones. The size of diphone and triphone synthesizers' databases is moderate (see Table 1). Because of the reduced quality of mobile devices' speakers, it is reasonable to use the 11 kHz / 8 bit databases instead of better resolutions. Triphone databases produce better quality than diphone ones, but diphone synthesizers still sound intelligible on mobile devices. The mobile version of the Festival TTS engine [14] (called Flite) uses a diphone based, 8 kHz English database, which is less than 4 MBytes. More detailed information about speech generation in mobile devices can be found in [15].

Table 1. Size of the databases of the Hungarian TTS, which was developed in the authors' laboratory.

Database features                                          Storage size
11 kHz, A-law, 8 bit, diphone
11 kHz, A-law, triphone (reduced set of triphones)
11 kHz, A-law, triphones (including all CVC triphones)

F. Human Computer Interaction

The requirements of blind people are much higher than those of the average user. Typically blind people are not computer specialists, they have never used screen reader applications and they are not enthusiastic about using computers. Consequently, if the usage of a system is not straightforward and intuitive for blind users, they can easily lose their motivation to use the system, even if its functionality could make their life easier.

For these reasons we created the speech user interface based on our previous experience with the SMSRapper application (which reads SMS messages aloud). The user can access the most important features with the keys of the mobile phone:

Key 0: The system's help menu. It tells the user the functions of the keys.
Key 1: Sets the previous waypoint as the actual waypoint.
Key 2: Sets the nearest waypoint as the actual waypoint.
Key 3: Sets the next waypoint as the actual waypoint.
Key 4: Reads the name of the actual waypoint.
Key 5: Reads the name of the next waypoint and the distance from it.
Key 6: Reads the direction (if it is turned on as an option) and checks if the direction is right.
Key 7: Reads the longitude and latitude.
Key 8: Reads the current UTC based time, which was read from the GPS data.
Key 9: Reads the current date, which was read from the GPS data.
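The shortcut keys above map naturally onto a single dispatch routine. The following schematic, platform-neutral sketch shows that mapping; in the real application the keys arrive as Symbian key events in the Controller class, and the spoken texts here are only placeholders.

// Schematic dispatch of the NaviSpeech shortcut keys listed above.
// Speak() stands in for the TTS output; the wording is illustrative only.
#include <iostream>
#include <string>

void Speak(const std::string& text) { std::cout << "[TTS] " << text << '\n'; }

void HandleKey(char key) {
    switch (key) {
        case '0': Speak("Help: key functions ...");                   break;
        case '1': Speak("Previous waypoint set as target");           break;
        case '2': Speak("Nearest waypoint set as target");            break;
        case '3': Speak("Next waypoint set as target");               break;
        case '4': Speak("Name of the actual waypoint");               break;
        case '5': Speak("Name of and distance to the next waypoint"); break;
        case '6': Speak("Current direction check");                   break;
        case '7': Speak("Longitude and latitude");                    break;
        case '8': Speak("Current UTC time from GPS");                 break;
        case '9': Speak("Current date from GPS");                     break;
        default:  break;                         // other keys: menu navigation
    }
}

int main() { HandleKey('5'); }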
Apart from the key commands NaviSpeech also has a speech enabled, multi-level menu system. All the features can be reached from this menu system. The features are sorted into categories. The selected menu item is always read out, and it can be activated with the left softkey of the mobile device. The main menus are the following:

Route information: actual waypoint, next waypoint, checking the direction
Additional information: actual direction, time, date, longitude, latitude
Options: connect to the GPS device, turn compass on/off, turn speech enabled menus on/off

IV. AN APPROACH FOR CREATING A MULTIPLATFORM SUI

In order to make the design of the speech user interface easier, an experimental approach was implemented. According to this approach the user interface (including both the graphical and the speech user interface) is created from an XML description file. The main idea is shown in Figure 5.

Fig. 5. The architecture of the XML based multimodal interface

The 3rd party application uses the main module as a dynamically linked library (DLL). This DLL interprets the XML user interface description file, and creates and supervises the speech and graphical user interfaces.
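As an illustration of how a 3rd party application might drive such a module, the sketch below shows a minimal, hypothetical C++ interface: the application hands over the XML description file and registers one event handler, while the module is responsible for building both the graphical and the speech user interface from the same description. Neither the class and method names nor any XML vocabulary are taken from the paper; they only illustrate the idea.

// Hypothetical sketch of an XML-driven multimodal UI module (illustration only;
// the names and the behavior are assumptions, not the paper's actual interface).
#include <functional>
#include <iostream>
#include <string>

struct UiEvent { std::string itemId; };    // e.g. id of the selected menu item

class MultimodalUi {
public:
    // In the real module this would parse the XML description and build
    // both the graphical and the speech user interface from it.
    bool LoadDescription(const std::string& xmlPath) {
        descriptionPath_ = xmlPath;
        Speak("Menu loaded");
        return true;
    }
    void SetEventHandler(std::function<void(const UiEvent&)> handler) {
        handler_ = std::move(handler);
    }
    void Speak(const std::string& text) { std::cout << "[TTS] " << text << '\n'; }
    // Called internally when the user selects an item in either modality.
    void SelectItem(const std::string& id) { if (handler_) handler_({id}); }
private:
    std::string descriptionPath_;
    std::function<void(const UiEvent&)> handler_;
};

int main() {
    MultimodalUi ui;                              // in NaviSpeech this would live in the DLL
    ui.SetEventHandler([&](const UiEvent& e) { ui.Speak("Selected: " + e.itemId); });
    ui.LoadDescription("navispeech_ui.xml");
    ui.SelectItem("route_information");
}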
In the case of NaviSpeech the graphical user interface (GUI) would include only the menus, while the speech user interface would include the menus, the navigation and the additional information introduced above.

More information about the XML based user interface description can be found in [16].

V. FUTURE PLANS

The features of NaviSpeech are iteratively discussed with the target user group. Blind users found the system's features very favorable and they suggested additional options to be implemented in the future.

They recommended that the route planner should be available from NaviSpeech itself. The route planner home page4 is not accessible for blind users. It would also be desirable to exclude the personal computer from the route planning process. In conclusion, NaviSpeech should contain speech enabled user forms where blind users can enter the beginning and end waypoints; this information should be sent to the route planner homepage, and the result (the final route in a GTM file) should be downloaded from the homepage.

4 http://terkep.t-online.hu

Another request is to be able to record routes. This means that a blind user walks through a route with an additional person, and during this walk all the GPS information is recorded by NaviSpeech and stored on the mobile device. Consequently the system could help the blind user next time, and there would be no need for the additional person.

Certainly there are more features to be implemented, like algorithmically increasing the accuracy of the GPS and using a Point Of Interest (POI) database to be able to tell more about the environment.

VI. CONCLUSION

In the current paper a possible approach to creating a speech enabled navigation system for blind people was shown. First, existing systems, the specialties of the GPS system and the important aspects of the blind user group were discussed. In Section III the architecture of the proposed system was introduced, the difficulties of creating speech synthesis in mobile devices were investigated, and the speech user interface of the system was described. At the end of the paper future plans were briefly presented.

According to the authors' knowledge this is the first mobile device based navigation system for blind people available in Hungarian. If the features briefly introduced in section IV are implemented, the beta version of the software will be released. The authors are looking for possible research and/or industrial partners.

ACKNOWLEDGEMENT

The research presented in the paper was partly supported by the Hungarian National Office for Research and Technology (NAP project no. OMFB-00736/2005 and GVOP project no. 3.1.1-2004-05-0485/3.0).

REFERENCES

[1] Nuance Talks. Available:
[2] Tiresias.org – Scientific and Technological Reports: Mobility of Blind and Elderly People Interacting with Computers. Available: http://www.tiresias.org/reports/mobicf.htm
[3] A. Helal, S. Moore, and B. Ramachandran, "Drishti: An Integrated Navigation System for Visually Impaired and Disabled," Proceedings of the 5th International Symposium on Wearable Computers, October 2001, Zurich, Switzerland.
[4] JAWS for Windows. Available:
[5] Tugs website. Available:
[6] StreetTalk (TM) GPS Solution for PAC Mate. Available:
[7] L. Ran, A. Helal and S. Moore, "Drishti: An Integrated Indoor/Outdoor Blind Navigation System and Service," Proceedings of the 2nd IEEE Pervasive Computing Conference, Orlando, Florida, 2004.
[8] P. Narasimhan, "Trinetra: Assistive Technologies for Grocery Shopping for the Blind," IEEE-BAIS Symposium on Research in Assistive Technologies, Dayton, OH, April 2007.
[9] J. Tölgyesi, "Development of a Speech Based Mobile Navigation System" (in Hungarian, Beszéd alapú mobil navigációs rendszer fejlesztése), Master Thesis, Budapest University of Technology and Economics, 2006.
[10] GPS Trackmaker. Available: http://www.gpstm.com
[11] R. Riggs, A. Taivalsaari, J. V. Peursem, J. Huopaniemi, M. Patel, A. Uotila, Programming Wireless Devices with the Java 2 Platform Micro Edition, Addison-Wesley, 2003, 434 p.
[12] G. Parmod, MS SAPI 5 Developer's Guide, InSync Software Inc.
[13] B. Möbius, "Corpus-based speech synthesis: methods and challenges," Arbeitspapiere des Instituts für Maschinelle Sprachverarbeitung (Univ. Stuttgart), 2000, AIMS 6 (4), pp. 87-116.
[14] A. Black, K. Lenzo, "Flite: a small fast run-time synthesis engine," 4th ISCA Speech Synthesis Workshop, Scotland, 2001, pp. 157-162.
[15] B. Tóth, G. Németh, "Challenges of Creating Multimodal Interfaces on Mobile Devices," Proc. of the 49th International Symposium ELMAR-2007 focused on Mobile Multimedia, 12-14 Sep. 2007, Zadar, Croatia, pp. 171-174.
[16] B. Tóth, G. Németh, "Creating XML Based Scalable Multimodal Interfaces for Mobile Devices," 16th IST Mobile and Wireless Communications Summit, Budapest, Hungary, July 2007.