(List organised by order of presentation)


Can close analysis of gesture and body language help us to understand the language comprehension and Theory of Mind skills of children with Down syndrome?

Lucy Dix, University of Leeds 

Children with Down syndrome are likely to have some speech and language difficulties, arising from a range of cognitive and physical difficulties. A poor phonological loop system, difficulties with short term memory, a small oral cavity and weak oral motor muscles all contribute to make verbal communication difficult for those with Down syndrome.

Our study uses a range of tasks which require no verbal response, to assess the language comprehension, working memory and theory of mind skills of 40 children aged between 2 and 10. Close analysis of the gestural data could markedly change how many participants are seen to pass each task; we will examine in detail a range of gestural and physical responses to the tasks to describe how participants are responding to key questions.

By using a mixed methods approach to our data collection and analysis we will be able to support our quantitative interpretations with rich qualitative descriptions. At this stage in our research we will be able to present some initial analysis for review and comment.

Popular Culture in the Literacy Classroom: Paying Attention to Modes

Rebecca Parry, University of Leeds

Marsh and Millard’s (2000) influential work, ‘Literacy and Popular Culture’, is underpinned by the argument that the ‘cultural capital’ (Bourdieu, 1977) of all children should be recognised, to ensure that their learning experiences are made meaningful. By valuing popular culture, teachers can ensure that children do not have to reject their home identities in the classroom. In doing so, connections can be made with the interests of children so that their literacy learning will be more motivated. In the many examples of research focused on literacy and popular culture, it is evident that it is not just the sheer pleasure of sharing popular culture texts which increases children’s motivation (as important as this is). Children are not simply enthused when their lived cultures are valued, but more fundamentally they are motivated because they can participate in (and are not excluded from) the learning that is constructed. This is especially the case when children undertake multimodal analysis of popular culture texts and begin to make explicit their implicit understandings of multimodal texts.

In this oral presentation I share children’s understandings of narrative as expressed in their multimodal text productions to provide examples of this learning process. I demonstrate the need for interrogatory pedagogic strategies (Wells, 2007), including practical productions, in order to enable children to meaningfully pay attention to the distinct and combinatory affordances of multimodal texts.


Bourdieu, P. (1977) Outline of a Theory of Practice. Cambridge, Cambridge University Press

Marsh, J. and Millard, E. (2000) Literacy and Popular Culture: Using Children’s Culture in the Classroom. London, Paul Chapman Publishing

Wells, G. (2007) Semiotic mediation, dialogue and the construction of knowledge. Human Development. Vol. 50 No. 5 p. 244-274

Capturing digitally-mediated literacy practices in classrooms

Ibrar Bhatt, University of Leeds

This paper outlines a guiding framework and multimodal methodology for capturing student writing activities as they unfold in real time. Building upon current advances in video analysis, Literacy Studies, and Multimodality, I capture the entire procedure of on-screen composition as a digital recording (screencast), alongside an embedded video recording of writers’ movements, and an audio recording of their vocalisations around the creation of their course assignments. This provides me with a detailed rendition of human and non-human actors’ interactions, on- and off-screen, which are then transcribed in a format amenable to analysis. Transcripts are created using relevant software (ELAN), and are supported by ethnographic observations, interviews, and collected student work. Drawing upon reflections from a doctoral study, I discuss the implications for the collection, transcription, and presentation of such data, and for established methodological practices.

Such a methodology has the potential to reveal a composite picture of a researched site and a detailed and dynamic rendition of digital literacy activities. The analytic methodology proposed, therefore, is useful for researchers paying greater attention to the ecologies in which digital literacies occur, the choreography and co-ordination of practices during the writing of an assignment, and how and why some practices end up elided from view.

To gesture or not to gesture. Investigating the composition of multimodal communicative acts and their utility during a map task

Jack Wilson, University of Leeds

In everyday situations people use language, not simply as a means of conveying information, but in order to coordinate during joint activities (e.g., giving directions to York Minster) which are composed of joint projects (e.g., coordinating on the location of a landmark) (Clark, 1996). From this perspective, an individual’s utterances may be thought of, not just as an exchange of information, but as contributions towards the joint goals of those involved in the interaction. Such contributions are not merely vocal, but are multimodal, frequently containing gestures that contribute, non-redundantly, to the overall meaning of an utterance. This paper focuses on such gestures.

Studies focussing on the effect of gestures within contributions have demonstrated that they play an important role in helping interactants reach a mutual understanding (Bavelas et al., 2011), direct joint attention (Bangerter and Chevalley, 2007), and accommodate their audience (Özyürek, 2002). Furthermore, it has been demonstrated that speakers, when given a choice, intentionally provide important semantic information through gesture (Melinger and Levelt, 2004).

This paper investigates the use of gesture during a dyadic route-drawing task, which has as its measurable output a map drawn by one participant under the guidance of the other. Specifically, this paper qualitatively investigates two communicative strategies: one in which the participants frequently produce gestures and another in which the participants tend to communicate with an emphasis on speech. Speech and gesture are analysed using Talmy’s (2000) semantics of space by focussing on the spatial/visual properties of the maps and exploring how these properties are represented in participants’ contributions within the task. In doing so, it is shown that (1) gestures and speech can be closely linked forming a single communicative unit; (2) gesturing seems to take less time; and (3) gesturing, when successful, seems to provide participants with a critical medium through which they can reach their communicative goals.


Bangerter, A. and Chevalley, E. (2007). Pointing and describing in referential communication: When are pointing gestures used to communicate? In MOG 2007 Workshop on Multimodal Output Generation, pages 17–28.

Bavelas, J., Gerwing, G., Allison, M., and Sutton, C. (2011). Dyadic evidence for grounding with abstract deictic gestures. In Stam, G. and Ishino, M., editors, Integrating Gestures: The Interdisciplinary Nature of Gesture, volume 4 of Gesture Studies, pages 49–60. John Benjamins Publishing Company, Amsterdam.

Clark, H. H. (1996). Using language. Cambridge University Press, Cambridge.

Melinger, A. and Levelt, W. J. (2004). Gesture and the communicative intention of the speaker. Gesture, 4(2):119–141.

Özyürek, A. (2002). Do speakers design their co-speech gestures for their addressees? The effects of addressee location on representational gestures. Journal of Memory and Language, 46(4):688–704.

Talmy, L. (2000). Toward a cognitive semantics, vol. I: Concept structuring systems. MIT Press, Cambridge, MA.

A multi-modal training approach to improve cochlear implant users’ ability to handle simultaneous talk

Amy V. Beeston, University of Sheffield; Emina Kurtić, University of Sheffield; Erica Bradley, University of Sheffield; Harriet Crook, University of Sheffield; Guy Brown, University of Sheffield; Bill Wells, University of Sheffield

Simultaneous talk is problematic for individuals who use a cochlear implant (CI). Until recently, even in one-to-one settings many users needed optimum conditions to hold a satisfactory conversation, e.g. a quiet environment and an awareness on the part of both participants that they should avoid talking at the same time. Simultaneous speech is surprisingly frequent in social settings, however, occupying 16% of the total talking time and affecting 41% of speaker turns in our corpus of 4 hours of informal British English conversation (Kurtić et al. 2012). Nonetheless, there has been no evidence to guide professionals or users on how they might deal with simultaneous talk.

Recent improvements in CI signal processing strategies mean that it is now more realistic for users to attempt to engage in simultaneous talk. In addition, we believe that multi-modal training may increase CI listeners’ conversational confidence. In this paper we report early development of an audio-visual environment for listening practice, in which audio tracks may be separated and recombined to allow listening to conversation with increasing degrees of overlap. Moreover, we explore strategies to translate audio-based cues that normal-hearing people rely upon into visual markers, in order to assist CI listeners’ understanding of simultaneous talk.


Kurtić, E., Wells, B., Brown, G. J., Kempton, T., & Aker, A. (2012). A corpus of spontaneous multi-party conversation in Bosnian Serbo-Croatian and British English. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012), 21-27 May 2012, Istanbul.

Alternative Ways of Looking

Caroline Bligh, Leeds Metropolitan University

This paper explores the application of an alternative methodological tool which was applied within prior doctoral study to examine the ‘silent’ experiences of a small number of young bilingual learners during the emergent stage of English language acquisition – the silent period. The study adopted a multi-method ethnographic approach to data gathering, including Flewitt’s (2005) ‘gaze following’, as an alternative means through which to identify silent participation within an early years setting. Interest was drawn towards comprehending ‘eye movements’, ‘gesture’ and ‘facial expressions’ and also the child as a ‘spectator’ (Saville-Troike, 1988). Recent studies (Flom, Lee and Muir, 2006) within the arena of early childhood identified the significance of gaze as a multimodal means of communication in pre-school children. Not only did Lancaster (2001) inform on the functions of gaze in young children’s interpretations of symbolic forms, but Flewitt’s (2005) sociocultural research also identified that many children in early years settings used new and additional ‘voices’ (Edwards et al., 1998) as a means to communicate – including gaze direction. Following gaze direction was therefore identified by the researcher as a significant additional methodological tool – a refined method of radically looking, chosen as a means to examine silent experiences.

Phonological ‘Wildness’ in Early Language Development: An Eye-Tracking Study

Catherine Laing, University of York

Onomatopoeic words (OWs) are often excluded from the phonological analysis of infant data, seen as a temporary aspect of the developing lexicon. However, OWs often constitute an important portion of infants’ earliest word forms, and are thought here to provide a scaffold in language development through the perception of phonologically ‘wild’ segments (Rhodes, 1994), in which the vocal tract’s full capacity is used in order to approximate non-human sounds. Such segments are hypothesised here to serve as an attention-marker in infant-directed speech (IDS).

The present study compares 14-month-old infants’ reactions to ‘wild’ (W) forms versus ‘tame’ forms (produced within the phonological conventions of the target language, T) in the input. Eye-tracking was used to analyse infants’ perception of these forms. Infants were tested using OWs presented in either a W or T manner. Paired images of familiar animals were presented, and an OW matching one of the animals was played over an audio device. Looking times were measured to determine which forms were most recognisable to the infants.

Results show that wildness is peripheral to infants’ perception of onomatopoeia, while reduplication plays a significant role in OW perception. This is supported by evidence from IDS showing extensive use of reduplication in the production of OWs.

Figurative language processing: Differential contributions of the cerebral hemispheres

Ekaterini Klepousniotou, University of Leeds

Communication is filled with ambiguity. We often use polite, indirect forms to make requests; humor and jokes frequently hinge on some secondary or unexpected meaning or feature of a word or phrase; we may have to revise an initial inference; and so on. In normal comprehension, it is common to mentally activate more than one meaning of words and phrases, often without awareness or intention. In fact, it has been suggested that in one minute of oral speech we use on average 4 known metaphorical expressions and 2 novel ones (Pollio et al., 1977).

One of the most prominent contributions of the right hemisphere (RH) to language processing seems to be in the appreciation of lexical-semantic relations. A series of studies will be presented that elucidate the role of the RH in lexical ambiguity processing in general and metaphor processing in particular, suggesting that secondary or subordinate, non-literal meanings are less available when the RH is dysfunctional.

Quran Recitation, Natural Language Processing, and Arabic and Islamic Studies

Claire Brierley, University of Leeds; Majdi Sawalha, University of Jordan; Eric Atwell, University of Leeds; James Dickins, University of Leeds

“Natural Language Processing Working Together with Arabic and Islamic Studies” is a 2-year EPSRC-funded project to study Quran recitation, linking sound, text, and linguistic annotations. Tajwid or correct Quranic recitation is very important in Islam.

The original insight informing this project is to view Tajwid mark-up in the Quran as additional text-based data for computational analysis. This mark-up is already incorporated into Quranic Arabic script, and identifies prosodic-syntactic phrase boundaries of different strengths, plus gradations of prosodic and semantic salience through colour-coded highlighting of pitch accented syllables, and hence prosodically and semantically salient words.

We have developed software for generating a phonetically-transcribed, stressed and syllabified version of the entire text of the Quran, using the International Phonetic Alphabet (IPA). This canonical pronunciation tier for Classical Arabic is informed and evaluated by Arabic linguists, Tajwid scholars, and phoneticians, and published in an open-source Boundary-Annotated Quran Corpus. We utilise statistical techniques such as keyword extraction to explore semiotic relationships between sound and meaning in the Quran, invoking a Saussurean-type view of the sign as ‘…a bi-unity of expression and content…’. Our investigation entails: (i) text data mining for statistically significant phonemes, syllables, words, and correlates of rhythmic juncture; and (ii) interpretation of results from interdisciplinary perspectives: Corpus Linguistics; Tajwid science; Arabic linguistics; and Phonetics and Phonology.


EPSRC. 2013. Natural Language Processing Working Together with Arabic and Islamic Studies.


Brierley, C., Sawalha, M., Atwell, E. 2012. Open-Source Boundary-Annotated Corpus for Arabic Speech and Language Processing. Proceedings of Language Resources and Evaluation Conference (LREC) 2012, Istanbul.

Sawalha, M., Brierley, C., Atwell, E. 2012. Predicting Phrase Breaks in Classical and Modern Standard Arabic Text. Proceedings of LREC 2012, Istanbul.

Multimodality in Arabic Virtual Learning Environments

Ahmed Alzahrani, University of Leeds and Eric Atwell, University of Leeds

Our research focus is a survey of current use of Virtual Learning Environments – VLEs – in Saudi Arabian University teaching, and recommendations for the future. This paper reports our findings on the use of VLE multimodal features in language and linguistics teaching in Saudi higher education. VLEs offer mixed use of text, sound and video resources in learning modules. We will present a range of examples of these multimodal learning resources from Saudi VLEs as well as some of their features. For instance, these resources (texts, sounds and videos) can be downloaded and re-used. They are, in addition, of a reasonable length, which makes them appropriate for use in language and linguistics teaching. These resources also have disadvantages: most videos have no subtitles, most audio recordings have no transcripts, and neither were professionally recorded, which decreases their re-usability. The current state of VLE systems in Saudi Arabian universities shows that they are suitable for language and linguistics teaching, although they still need further improvement to be more effective.

The modality question in Arabic dialect corpus linguistics: how should we capture dialect?

Eric Atwell, University of Leeds and Fiona Douglas, University of Leeds

We need a corpus of Arabic dialects. But what mode should we aim to capture: the spoken form, the written form, or some combination? And how should we transcribe spoken Arabic? Can we learn any lessons from established corpus-building practice?

Many of the early corpora of English (e.g. Brown, LOB, ACE) included only published Standard English written texts. Later corpora, such as ICE, included significant proportions of spoken text, but transcribed using Standard English orthography. The Survey of English Dialects aimed to capture dialect and pronunciation using audio recordings and IPA renderings of individual words, but IPA is unintelligible to all but the specialist reader. Eye-dialect is more accessible, but raises issues of authenticity, reliability, and comparability across texts. And so the majority of present-day corpora of English have standard written texts as their mainstay; those attempting to include dialect forms are in the minority.

Modern Standard Arabic is the official written form worldwide. Analogous to the Brown family corpora, the International Corpus of Arabic project is collecting “samples of written Modern Standard Arabic (MSA) selected from a wide range of sources” from every Arab country. Web-corpora such as the Leeds Arabic Web Corpus, OUP Arabic Corpus or SketchEngine ArTenTen contain text from web-pages, including some blogs and other informal sources; they try to capture dialect forms, but have little or no standardisation of orthography. Although Arabic dialects are increasingly used on TV (e.g. soap operas), in internet blogs, videos, personal emails, and conversation, there is no standard written dialect. 3arabizi, Latin-scripted Arabic used in some informal dialect sources, and Arabic’s complex morphology are further complications for the would-be dialect corpus builder. In addition to their intrinsic linguistic interest, these dialect varieties could be commercially significant: companies such as Google, Facebook, and Amazon could use dialect text analytics to localise and personalise their advertising, and researchers and industry need empirical dialect data to train Machine Learning classifiers for online Arabic text. But currently, there is no purpose-built Arabic dialects corpus.

We propose a General Architecture for Dialect Analytics to bring together best practice, corpus resources, and tools for English and Arabic dialect analytics. GADA addresses the multimodality of dialect by including parallel exploration of: audio recordings, IPA and variant phonetic transcripts, 3arabizi and Kerl non-standard semi-phonetic orthography, standard spelling transcriptions, and capture of dialect from YouTube. GADA will enable us to identify parallels and key differences between English and Arabic dialect studies.

NOOR multimodal research on the language of the Quran

Mohamed Menacer, Taibah University

The NOOR Research Centre for Quran Computing has been established as a focus for research on computer processing and analysis of the sounds, text and images of the Quran, the central holy book of Islam. NOOR researchers have catalogued and brought together a wide range of past and current Quran computing research. One fundamental issue is: which is the primary or central modality for verified transmission of the Quran? The Arabic text is widely available on the Web, but many versions are corrupted or inaccurate. One answer is to hold a bank of images of the recognised authentic Quranic Arabic script as the primary source, and to use an image processing verification approach to validate online copies of the Quran. This approach is approved by Quran scholars who recognise a single written original source as the basis for true transmission of the correct script. However, Muslims hold that the Quran verses were first transmitted orally to the prophet Mohammed who then passed them on orally; so audio recitation recordings may also be a primary source. Computer speech processing enables new ways to model the oral recitation of the Quran, and map the audio signal to the text and image streams. Some scholars see yet another modality as primary: arguably the most important aspect of the Quran is its message or meaning, rather than the text, sounds or images used to convey the meaning. Research in computational semantics, knowledge representation and ontologies applied to the Quran gives us another new modality to explore and link to sound, text and image streams.

Processing diacritics in Arabic: the results of an eye movement tracking study

Ehab W. Hermena, University of Southampton; Denis Drieghe, University of Southampton; Sam Hellmuth, University of York; Simon P. Liversedge, University of Southampton

In written Arabic, short vowel sounds are conveyed by diacritics, but most texts are printed without diacritisation. Native Arabic readers use context to disambiguate, but editors sometimes provide diacritics in ‘critical’ words (homographs whose ambiguity risks altering overall sentence meaning). Materials developed for second language learners of Arabic vary in their provision of diacritics, from full vowelisation to none, but vowelisation of critical words only appears to be rare.

We report the results of a study (Hermena et al, forthcoming) which used eye-movement tracking to investigate how and when native readers access diacritics-based phonology to disambiguate homographic verbs in the active vs. passive (e.g. ضرب /daraba/, hit; ضرب /doriba/, was hit). We tracked the eye movements of 25 native Arabic readers while reading sentences in five conditions: active fully-diacritised, active non-diacritised, passive fully-diacritised, passive non-diacritised and passive verb-only diacritisation. Sentences were designed to trigger a ‘garden-path’ effect: without diacritics on the verb, readers would not be able to discriminate passive from active until a prepositional phrase (e.g. بيد /biyad/, by the hand of) late in the sentence.

Results show that native readers effectively extract the phonological information conveyed by diacritics on the verb (only): early reading times on the disambiguating region were significantly reduced in the passive verb-only diacritisation condition. In contrast, however, the ‘garden path’ effect was not reduced in fully-diacritised passive sentences, suggesting that native readers of Arabic treat the information provided by full diacritisation as essentially redundant. Implications for the development of materials for second language learners will be discussed.


Hermena, E.W., Drieghe, D., Hellmuth, S., and Liversedge, S. P. (forthcoming). Eye movements investigation of processing diacritics in Arabic: disambiguating homographic verbs and the impact of full sentence diacritisation. Ms., University of Southampton.

“Curl round over the top of the alpine garden”: Achieving (Mis)understanding(s) in Interpreted Interaction involving Sign Language

Victoria Crawley, York St John University; Andrew Merrison, York St John University

This paper will demonstrate that the transfer of meaning in an interpreted event involves bi-lingual, bi-cultural and bi-modal knowledge on the part of a participating Signed Language interpreter. We consider issues such as: how the interpreter deals with simultaneous starts but bi-modal displays (for example, the hearing person takes an inbreath as a preface to a turn, and the Deaf person raises their hands as a preface to theirs – one is auditory, the other visual); the inherent ambiguity in much oral vocabulary (for example, the word “talk” can mean simply ‘speak’, ‘a short lecture’, or ‘a confrontational meeting at the break up of a relationship’); and the fact that under-specification is often not a viable option in Sign Language (the opening of a door must be inwards or outwards; the size of a person may need to be included in the verb “sit”; and during reported speech, the height of both parties would be encoded). All these meanings would need to be made explicit by the interpreter. These are some of the issues that we address to show that the bi-cultural, bi-lingual and bi-modal interpreter not just should, but rather must interact as a participant in the dialogues they facilitate.

Cn U read ths? Creative spellings in subtitling

Alina Secara, University of Leeds

Research into social translation practices (O’Hagan, 2008; Perrino, 2009), as well as studies in fields as diverse as patent reviews, journalism and computing (Howe, 2008), have investigated differences in practice between professional and non-traditional settings, and observed the richness of techniques and approaches used in the latter. In this presentation I will investigate the use of non-standard txt language spellings in subtitling, arguing that they can be advantageously used in professional subtitling practice for a specific medium as they allow a certain liberation from formal audiovisual translation constraints (e.g. maximum number of characters allowed per line). The results of an eye-tracking experiment set up to elicit data on the reception of subtitling containing txt language by typical consumers will also be presented. Typical indicators such as average fixation duration, number of back and forth shifts (between image and subtitle) and regressive eye movements will be statistically investigated. This study is designed to motivate comments on the validity of integrating creative subtitles in specific environments as well as on the readability of textisms in subtitling.


Howe, J. (2008). Crowdsourcing: How the power of the crowd is driving the future of business. London: RH Business Books.

O’Hagan, M. (2008). Fan translation networks: an accidental translator training environment? In J. Kearns (Ed.). Translator and interpreter training: Issues, methods and debates (pp. 159-183). London, New York: Continuum.

Perrino, S. (2009). User-generated translation: The future of translation in a Web 2.0 environment. The Journal of Specialised Translation 12. Retrieved March 14, 2014, from: http://www.jostrans.org/issue12/art_perrino.php.

Multimodality, accessibility and extratitles: a reception study

Sara Ramos Pinto, University of Leeds

In an audiovisual product, the verbal code is arguably the mode that presents more challenges to a foreign audience not capable of understanding the source language. However, the visual mode also participates in the construction of meaning and some of the visual elements (together with their sociocultural meaning) might present serious challenges to a foreign audience who might not be able to recognise or interpret the meaning of certain objects, situations or behaviours on screen.

In this context, translators play a crucial role in guaranteeing communication and enhancing the accessibility of multimodal products. Previous studies (Ortabasi 2001; Bucaria and Chiaro 2007; Cavaliere 2008) have, however, pointed to the loss of meaning brought about by standard subtitling and the lack of accessibility illustrated by the sense of confusion among viewers when faced with visual or verbal elements/references they are not able to interpret. This paper will present the results of an experimental reception study (using eye-tracking and a questionnaire) investigating the impact of innovative subtitling practices, such as the use of additional titles, on viewers’ cognitive load, interpretation and assessment. The results show that, even though the use of additional titles demands a higher cognitive load, viewers seem to make a very positive assessment as a result of an easier interpretation of the references made in the film.
