Transcriber architecture
- Transcriber directory tree
- MacOS binary package
Global variables
Global variables usage
A global variable is a variable available throughout the Tcl code. In Transcriber, there is only one global variable, the array v. To use one of its values, just type:
global v
in the current procedure, before the Tcl code that uses it.
Example:
proc tryit {} {
    global v
    set v(my_stuff) 3
}
Adding a global variable
To add a new element to the array v, which contains all the global variables, just type in a Tcl script:
set v(my_element) value_element
But if you want it to be saved when you exit Transcriber, you have to add the following line to etc/default.txt:
v(my_element) ""
When Transcriber is launched, it reads that line and creates the global variable v(my_element) with an empty value.
If a value is set in the global variable while Transcriber is running, that value is saved on exit in the user configuration file (.transcriber on Linux, transcriber.pref on Windows, or Transcriber Configuration on Mac). The next time Transcriber is launched, that value is set in v, which means it takes priority over what is defined in etc/default.txt.
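The precedence described above can be pictured with a minimal Tcl sketch. It assumes, purely for illustration, that both files have already been reduced to flat name/value lists; the loadConfig procedure and its two arguments are hypothetical names, not part of Transcriber.

```tcl
# Hypothetical sketch: a later "array set" overwrites earlier values,
# so user preferences take priority over the defaults.
proc loadConfig {defaults userPrefs} {
    global v
    array set v $defaults    ;# values as found in etc/default.txt
    array set v $userPrefs   ;# values saved in the user configuration file
}

loadConfig {my_element "" shape,wanted -1} {my_element 42}
puts $v(my_element)     ;# value coming from the user file
puts $v(shape,wanted)   ;# default left untouched
```

Elements that appear only in the defaults keep their default value; elements present in both lists end up with the user value.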
List of global variables
Global variable | Description |
autosave,name | name under which current transcription is auto-saved |
autosave,next | flag on if autosave handler is registered |
autosave,time | time, in minutes, before autosaving after a modification (0: disabled) |
backup,ext | extension for backup (default to ~) |
bgPos,chosen | chosen position for background selection |
bindings | pairs of key/inserted string |
color,bg | background color |
color,bg-back | background color for background noise |
color,bg-evnt | background color for events |
color,bg-sect | background color for section |
color,bg-sel | background color for selected signal |
color,bg-sync | background color for synchro |
color,bg-text | background color for text |
color,bg-turn | background color for turns |
color,fg-back | foreground color for background noise |
color,fg-evnt | foreground color for events |
color,fg-sect | foreground color for section |
color,fg-sync | foreground color for synchro |
color,fg-text | foreground color for text |
color,fg-turn | foreground color for turns |
color,hi-sync | current synchro color |
color,hi-text | current text color |
convert_events | convert strings [i] to events for old .xml files |
curs,event | next event for cursor move |
curs,fast | callback for fast fwd/bwd auto repeat |
curs,max | maximal cursor position during play (end of signal or sel.) |
curs,min | start of play for repeat (begin of signal or selection) |
curs,pos | current position of cursor in signal |
curs,start | playback start time |
debug | flag for debug menu display |
demo | switch to demonstration mode |
encoding | if a different encoding is to be used |
encodingList | list of IANA encoding names/usual names |
ext,lbl | list of extensions for importable label files |
ext,snd | list of known extensions for sound files |
ext,trs | list of extensions for importable transcription files |
file,default | default configuration file |
file,dtd | DTD file for transcriptions in XML format |
file,local | user localization file |
file,user | user configuration file |
find,case | case sensitiveness for find ("-nocase" or "") |
find,direction | search direction for find ("-forward" or "-backward") |
find,mode | mode for find ("-exact" or "-regexp") |
find,replace | replacement string |
find,what | string to look for |
font,axis | font used for axis |
font,event | font used for events |
font,info | font used for infos |
font,list | font used for fixed length lists |
font,mesg | font used for messages |
font,text | font used for text editor |
font,trans | font used for transcriptions in segments |
geom,$w | default geometry for window $w |
glossary | value/comment word pairs of user glossary |
img,$name | bitmap image |
keepconfig | ask to save configuration before leaving |
lang | language for menus ("fr" for French, default to English) |
language | list of pairs iso639-code/language-name for localization |
multiwav,file | stores the current MultiWav menu file selection |
multiwav,files | list of all the files in the MultiWav menu |
multiwav,path | list of the full pathnames of the MultiWav menu files |
newtypes | list of supported import formats with description |
options,file | default file for user configuration |
options,list | values to be saved in user configuration |
path,base | base directory of Transcriber |
path,doc | directory for help files |
path,etc | path for default config values and DTD |
path,image | directory for GIF or bitmap images |
path,shape | default directory for centi-second sound shapes |
path,sounds | last directory used for sound files selection |
path,tcl | directory for Tcl scripts |
play,after | callback after sound playback is over |
play,auto | automatic play new selection or signal (1 or 0) |
play,no-fast | temporary inhibition of fast forward/backward |
play,state | currently playing or not |
playbackBeep | beep sound file |
playbackBefore | go back before playing |
playbackMode | continuous/pause/beep/stop/loop playback mode |
playbackPause | pause duration between segments |
playbackSegmt | set if playing a single segment |
playbackSpeed | speed playback factor (unsupported) |
preferedPos | cursor insertion pos in text editor (start/end of line) |
proc,id | id for numbering of socket connections to file server |
scribe,name | default transcriber's name |
segmt,curr | id of current segment |
segmt,move | id of segment whose boundary is currently being moved |
sel,begin | begin of selected area of signal |
sel,end | end of selected area of signal |
sel,event | next event for automatic extension of selection |
sel,start | position of initial click for selection |
sel,text | text describing selection limits |
shape,bg | request shape calculation in background |
shape,cmd | sound command containing shape of signal |
shape,min | minimal duration for shape request (else max for display) |
shape,wanted | if user wants shape calculation |
sig,base | header size for raw files |
sig,channels | channels for raw audio files |
sig,cmd | sound command for signal access |
sig,desc | variable containing signal description to be displayed |
sig,gain | scale tk widget for volume gain change |
sig,header | raw sound file header size |
sig,len | length of signal (in seconds) |
sig,max | = sig,min + sig,len |
sig,min | beginning of signal (should be 0) |
sig,name | file name of audio signal |
sig,open | flag to see if an audio file has been opened |
sig,port | socket port for audio file server |
sig,rate | sound rate for raw audio files |
sig,remote | access to files through audio file server or not |
sig,server | audio file server |
sig,shortname | short file name of audio signal |
space,auto | automatic space insertion |
spell,* | related to spell checker |
tk,dontmove | flag to freeze once the cursor update inside text widget |
tk,edit | text tk widget |
tk,play | button tk widget for play |
tk,stop | button tk widget for stop |
tk,wavfm | main waveform tk widget |
trace,* | related to performance monitoring |
trans,desc | description of transcription for info window |
trans,format | file format of the transcription |
trans,list | ordered list of tags for segments in text widget |
trans,modif | flag "transcription modified" |
trans,name | file name of transcription |
trans,path | default path for open/save transcription dialog boxes |
trans,root | id of transcription root tag |
trans,saved | flag if transcription has been saved at least once |
trans,seg? | list of transcription segments at level ? |
type,chosen | section type chosen in dialog or menu |
undo,list | infos for undo |
undo,redo | flag on if undo is in fact redo |
var,msg | variable for selection infos and other messages |
view,$win | flag for frame/window display |
$wav,height | height of waveform widget (in pixels) |
$wav,left | left position of window in signal (in sec) |
$wav,resolution | initial resolution for signal |
$wav,right | = $wav,left + $wav,size |
$wav,scale | scrollbar tk widget for scale change |
$wav,scroll | scrollbar tk widget for horizontal move |
$wav,size | length of window |
$wav,sync | list of tk widgets to be synchronized |
wavfm,list | list of all waveform views |
zoom,list | infos for unzoom |
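Element names such as sig,len are plain strings that happen to contain a comma, and names written with $wav or $w use the value of a Tcl variable (typically a widget path) as a prefix. The following sketch illustrates the two derived entries of the table, sig,max and $wav,right. The updateDerived procedure, the widget name .wav and all numeric values are made up for the example.

```tcl
# Hypothetical helper: recompute the derived elements listed in the table.
proc updateDerived {wav} {
    global v
    # "$wav" is substituted first, giving an element name such as ".wav,right"
    set v($wav,right) [expr {$v($wav,left) + $v($wav,size)}]
    set v(sig,max)    [expr {$v(sig,min) + $v(sig,len)}]
    return [list $v($wav,right) $v(sig,max)]
}

# Made-up values; in Transcriber they are maintained by the application.
array set v {sig,min 0 sig,len 120.5 .wav,left 10.0 .wav,size 30.0}
puts [updateDerived .wav]
```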
Transcriber directory tree
Source package
Directory name | Content |
arabic\ | Arabic support patch |
convert\ | Tcl script modules used for format conversion |
debian\ | Goodies to build the Transcriber debian package |
demo\ | Sound and transcription demo files (wav and trs) |
doc\ | HTML documentation available from the help menu of Transcriber |
etc\ | Default configuration file, DTD, localization file |
src\ | Sources for new Tcl commands and Tk widgets |
tcl\ | Tcl scripts |
themes\ | Transcriber theme management |
Linux package
If Transcriber is installed from the sources in your home directory (~), Transcriber files will be installed in the directories ~/lib/ and ~/bin/ with the following structure:
Directory name | Sub directory or file name | Content |
lib\ | snack2.2\ | Multiplatform audio driver library coded in C |
lib\ | tcLex1.2\ | Flex-like parsing extension for Tcl used to create a TRS parser based on a DTD file and manage the list of speakers in a TRS file |
lib\ | transcriber1.5\arabic\ | Arabic support patch |
lib\ | transcriber1.5\convert\ | Tcl script modules used for format conversion |
lib\ | transcriber1.5\demo\ | Sound and transcription demo files (wav and trs) |
lib\ | transcriber1.5\doc\ | HTML documentation available from the help menu of Transcriber |
lib\ | transcriber1.5\etc\ | Default configuration file, DTD, localization file |
lib\ | transcriber1.5\tcl\ | Tcl scripts |
lib\ | transcriber1.5\themes\ | Transcriber theme management |
lib\ | transcriber1.5\libtrans.so | Dynamic shared library |
lib\ | transcriber1.5\pkgIndex.tcl | Index of the Tcl commands available in the library |
bin\ | trans | sh script used to launch Transcriber |
bin\ | transar | sh script used to launch Transcriber in Arabic support mode |
Windows package
Directory name or file name | Content |
lib\snack2210\ | Multiplatform audio driver library coded in C |
lib\tcLex12a1\ | Flex-like parsing extension for Tcl used to create a TRS parser based on a DTD file and manage the list of speakers in a TRS file |
lib\transcriber1.5\convert\ | Tcl script modules used for format conversion |
lib\transcriber1.5\demo\ | Demo files (wav and trs) |
lib\transcriber1.5\doc\ | HTML documentation available from the help menu of Transcriber |
lib\transcriber1.5\etc\ | Default configuration file, DTD, localization file |
lib\transcriber1.5\tcl\ | Tcl scripts |
lib\transcriber1.5\themes\ | Transcriber theme management |
lib\transcriber1.5\pkgIndex.tcl | Index of the Tcl commands available in the library |
lib\transcriber1.5\README | README file |
lib\transcriber1.5\libtrans.dll | Dynamic shared library |
lib\treectrl2.2\ | File explorer Tk widget |
trs.ico | Transcriber icon |
gpl.txt | GNU General Public License |
unins000.exe | Uninstall executable |
transwin.exe | Transcriber launcher |
tclkit-win32.exe | Executable file containing Tcl and Tk |
C code
Part of the Transcriber code was developed in C to speed up sound widget computations and usage.
C functions
C function | Tcl command | C files | Comment |
AxisCmd | axis | axis.c, trans.h | Tk widget in C for time axis |
SegmtCmd | segmt | segmt.c, trans.h | Tk widget in C for segmentation |
WavfmCmd | wavfm | wavfm.c, trans.h | Tk widget in C for waveform display |
Trans_Init | | trans.c, trans.h | Initializes the libtrans library and creates the Tcl commands axis, segmt and wavfm. It is called when package require trans is executed |
Trans_SafeInit | | trans.c, trans.h | Same as Trans_Init but in safe mode |
C functions usage
Once the C code is compiled, the C functions are available via the shared library libtrans.so on Linux or libtrans.dll on Windows.
To use those Tcl commands in a Tcl script, it is necessary to require, in Tcl, the trans package:
package require trans 1.5
That command sources the package index pkgIndex.tcl, which tells Tcl how to load the trans package.
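If the directory holding pkgIndex.tcl is not on Tcl's search path, package require raises an error. A defensive way to load the package might look like the sketch below. The directory appended to auto_path is only an assumption based on the ~/lib install location mentioned for the Linux package, and the transLoaded variable is a hypothetical name.

```tcl
# Assumption: the library was installed under ~/lib; adjust to your setup.
if {[info exists env(HOME)]} {
    lappend auto_path [file join $env(HOME) lib]
}

# Try to load the trans package and report the problem instead of aborting.
if {[catch {package require trans 1.5} err]} {
    puts stderr "trans package not available: $err"
    set transLoaded 0
} else {
    set transLoaded 1
}
```

When transLoaded is 1, the axis, segmt and wavfm commands listed in the table of C functions should be available to the script.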
C source files
All the C source files are located in the src\ directory.
File names | Description |
axis.c | C code for the axis Tk widget |
segmt.c | C code for the segmt Tk widget at the signal level |
shape.c | Computes the global shape of the signal using the Snack sound sub-commands $snd centi, $snd shape, $snd get and $snd order |
trans.c | Main body of the trans package, exporting the commands AxisCmd, SegmtCmd and WavfmCmd |
trans.h | Header of the C library. It declares the following functions as external: AxisCmd, SegmtCmd, WavfmCmd, Trans_Init, Trans_SafeInit |
wavfm.c | C code for the wavfm Tk widget |
Tcl scripts
File names | Description |
About.tcl | Displays embedded help files. Tries to view them in the default browser (Mozilla, Firefox or Internet Explorer). |
BgShape.tcl | Compute the shape of a signal. Script launched as a background sub-process by Transcriber when background shape calculation mode is on. |
ComputeShape.tcl | Stand-alone script for pre-computing a set of signal shapes (see comments in the code for the command line options) |
Debug.tcl | Very rough debugger window which can be activated when debug menu in general options is on. |
Dialog.tcl | Some generic functions for management of the user interface. |
Edit.tcl | Management of the text editor pane. |
Episode.tcl | Management of global properties of the transcription edited in "File/Edit Episode attributes..." |
Events.tcl | Management of: |
Interface.tcl | Management of the user interface. |
Main.tcl | The main script: loads all needed libraries and other scripts, reads the configuration, and parses the command line. |
Menu.tcl | Menu management. |
MultiWav.tcl | Code intended for meeting recordings management. |
Play.tcl | Management of various playback modes |
Segmt.tcl | Within Transcriber, a "segmentation" designates one layer of the transcription (i.e., sections, turns, synchros, background conditions) and also the associated segmentation widget displayed under the signal. |
Signal.tcl | Signal management. |
SoundServer.tcl | Stand-alone script used to provide access to remote sound files on a server (to be configured for the application, see the code). |
Speaker.tcl | Management of the speakers and turns of each TRS file. |
Spelling.tcl | Spell checking with Aspell if available |
Synchro.tcl | Management of the list of breakpoint times |
Topic.tcl | Topic and sections management, very similar to Speaker.tcl |
Trans.tcl | Management of transcription I/O and display |
Undo.tcl | Management of: |
Waveform.tcl | Waveform management. |
Xml.tcl | General-purpose XML library. Loads and initializes XmlItem, XmlDtd and XmlParse |
XmlDtd.tcl | Management of an XML DTD . |
XmlItem.tcl | Management of XML tags and data in Tcl in an object-oriented way. |
XmlParse.tcl | Parsing of an XML document in Tcl using the tcLex library. |
Configuration files
List of Transcriber configuration files:
File name | Directory | Description |
beep.au | etc/ | Transcriber beep sound |
default.txt | etc/ | Default configuration file of Transcriber. When Transcriber is launched, it parses default.txt and, for each line of the file that defines a parameter, sets the corresponding element of the global v array. For example, if the line is shape,wanted -1, it runs set v(shape,wanted) -1. Only the variables defined in default.txt are saved in ~/.transcriber. |
local_cs.txt | etc/ | Tcl array of all the Transcriber messages translated into Czech |
local_fr.txt | etc/ | Tcl array of all the Transcriber messages translated into French |
local.txt | etc/ | Not used any more. Kept empty for backward compatibility |
trans-13.dtd | etc/ | DTD file defining the XML structure of the TRS files output by Transcriber |
trans-cha.dtd | etc/ | DTD file adapted from trans-13.dtd. The purpose of this DTD is to make Transcriber-1.4.1 work with the Childes format |
.transcriber | Your home directory | User preferences file on Linux. When Transcriber is launched, it reads the user configuration file and the default configuration file (default.txt); in case of conflict, it gives priority to what is set in the user configuration file. |
transcriber.pref | Your sub-directory in the directory C:\Documents and Settings | User preferences file on Windows. |
Transcriber Configuration | ~/Library/Preferences | User preferences file on Mac |
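The way default.txt lines end up in the v array can be sketched as follows. This is not Transcriber's actual loader: the parseDefaults procedure is hypothetical, and the sketch assumes one "name value" pair per line (as in the shape,wanted -1 example) and that lines starting with # are comments.

```tcl
# Hypothetical parser: turn "name value" lines into elements of v and
# return the list of names, i.e. the candidates for saving on exit.
proc parseDefaults {text} {
    global v
    set names {}
    foreach line [split $text "\n"] {
        set line [string trim $line]
        # Skip blank lines and (assumed) comment lines
        if {$line eq "" || [string index $line 0] eq "#"} continue
        # Treat the line as a two-element Tcl list: name and value
        if {[catch {llength $line} n] || $n != 2} continue
        lassign $line name value
        set v($name) $value
        lappend names $name
    }
    return $names
}

set sample "# sample settings\nshape,wanted -1\nscribe,name \"\"\n"
set saved [parseDefaults $sample]
puts "elements defined by the sample: $saved"
```

Keeping the list of names read from the defaults is one simple way to restrict what gets written back to the user configuration file.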
I/O filters and formats
All I/O filters are located in the convert/ directory.
If you want to create a new filter, check convert/README.
Format | Input / Output | Extension | Filename | Comment |
CHA | In | .cha | cha.tcl |
Filter for the .cha (Childes/CHAT) format provided by Zhibiao Wu from LDC. Support for this format is still experimental.
When reading a .cha file, the tool switches to a chat mode with new attributes available in the interface.
It switches back to the initial mode when creating a new transcription or reading a file in any non-Chat format. |
CTM | In | .ctm | ctm.tcl | NIST file transcription format (including RT'03 format extensions and multi-level display as label) |
ESPS / xwaves | In | .lab | xwaves.tcl |
ESPS xwaves is an environment for the analysis of speech data that stores label information in ASCII. It is connected to an HMM toolkit called HTK. The software has been bought by Microsoft and given to KTH, and the toolkit source code can be downloaded from its website.
|
HTML | Out | .html | html.tcl |
Export in HTML format
|
hub4e96 | In | .txt .sgml | hub4e96.tcl | Data format of the HUB4 English corpus produced by LDC in 1996 and 1997. |
LIMSI | In/Out | .lbl | lbl.tcl |
The Lbl files are ASCII transcription files in which:
- each line represents a segment;
- each line begins with the begin time of the segment, followed by the transcription, which may contain enriched tags.
Ex:
0.0 [b]
0.217 France-Inter il est 19 heure, le journal, Christophe Hondelatte
3.451 [musique]
8.246 [musique]+bonsoir
8.833 [jingle]
|
LDC | In/Out | .typ | typ.tcl |
Format created by the LDC. This is the native format of the first
Transcriber versions.
|
MDTM | In | .mdtm | mdtm.tcl |
MDTM is a NIST segmentation format, but the filter implemented in Transcriber only extracts the speaker information. It is compliant with the NIST RT'03 specification.
|
OGI lola | In | .lola | lola.tcl |
The lola files are ASCII "location and label" files. They are similar to the .phn files of the TIMIT database, except that each file in this distribution has the header:
MillisecondsPerFrame: 3.0
After that comes a series of lines, one per segment, of the form:
[begin frame] [end frame + 1] label
For example:
200 237 ah
The [ah] segment extends from frame 200 to frame 236 inclusive. The end label is 237 for historical reasons.
|
SCLITE | In | .sgml | sclite.tcl |
Filter developed in the framework of the European project CORETEX.
It takes as input a .sgml file generated by Sclite, the NIST Speech Recognition Scoring Toolkit.
Basically, this file contains the result of the alignment between a reference .stm transcription and a hypothesized .ctm automatic word transcription.
When such a file is opened, Transcriber displays it in a friendly way to underscore matches and differences (read transcriber/convert/sclite.tcl for more information)
|
SDT | In | .sdt | sdt.tcl | Acoustic segmentation format defined by NIST in 2000 for TREC-9 SDR. Its specification can be found on the NIST website |
STM | In/Out | .stm | stm.tcl | NIST transcription file format. It is used by sclite as a reference to evaluate automatic transcription. |
TIMIT | In | .phn .wrd .txt | timit.tcl | Format used in the TIMIT database, a database of speech created in 1989 by Texas Instruments and the Massachusetts Institute of Technology. The TIMIT Corpus consists of 630 speakers reading a list of 10 phonetically-rich sentences (selected from a larger set). |
TEXT | Out | .txt | text.tcl |
Filter to export in text format:
speaker1: transcript1
|
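To make the two line-oriented formats in the table more concrete, here is a small Tcl sketch that reads one LIMSI .lbl line and one OGI lola line. Both procedures are hypothetical helpers written for this illustration; they are not taken from lbl.tcl or lola.tcl. The lola conversion assumes that a segment ends where frame "end + 1" begins.

```tcl
# Split one .lbl line into {beginTime text}; returns {} if the line
# does not start with a number.
proc parseLblLine {line} {
    if {[regexp {^\s*([0-9]+(?:\.[0-9]+)?)\s+(.*)$} $line -> time text]} {
        return [list $time [string trim $text]]
    }
    return {}
}

# Convert a lola segment line "begin end+1 label" to seconds, given the
# MillisecondsPerFrame header value (assumed interpretation of the bounds).
proc lolaToSeconds {line msPerFrame} {
    lassign $line begin endPlusOne label
    set start [expr {$begin * $msPerFrame / 1000.0}]
    set stop  [expr {$endPlusOne * $msPerFrame / 1000.0}]
    return [list $start $stop $label]
}

puts [parseLblLine {8.246 [musique]+bonsoir}]
puts [lolaToSeconds {200 237 ah} 3.0]
```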