3. List of Papers

    3.1 Visual Cues

      Mukhopadhyay 99   S. Mukhopadhyay, B. Smith. Passive capture and structuring of lectures. Proceedings of the 7th ACM International Conference on Multimedia, Orlando, FL, 1999, pp. 477-487.
      Dorai 03   C. Dorai, V. Oria, V. Neelavalli. Structuralizing educational videos based on presentation content. Proceedings of the 10th International Conference on Image Processing, Barcelona, Spain, 2003.
      Smith 98   M.A. Smith, T. Kanade. Video skimming and characterization through the combination of image and language understanding. Proceedings of the 1998 IEEE International Workshop on Content-Based Access of Image and Video Databases, Bombay, India, 1998, pp. 61-70.
      Souvannavong 04   F. Souvannavong, B. Merialdo, B. Huet. Latent semantic indexing for video content modeling and analysis. Proceedings of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval, New York, NY, 2004, pp. 243-250.

    3.2 Audio Cues

      Fujii 03   A. Fujii, K. Itou, T. Akiba, T. Ishikawa. A cross-media retrieval system for lecture videos. Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech 2003), Geneva, Switzerland, 2003, pp. 1149-1152.
      Glass 04   J. Glass, T.J. Hazen, L. Hetherington, C. Wang. Analysis and processing of lecture audio data: preliminary investigations. Proceedings of the HLT-NAACL 2004 Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval, Boston, MA, 2004, pp. 9-12.
      Witbrock 97   M. Witbrock, A.G. Hauptmann. Speech recognition and information retrieval: experiments in retrieving spoken documents. Proceedings of the 1997 DARPA Speech Recognition Workshop, Chantilly, VA, February 2-5, 1997, pp. 160-164.
      Chen 98   S.S. Chen, P.S. Gopalakrishnan. Speaker, environment and channel detection and clustering via the Bayesian information criterion. Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, Lansdowne, VA, 1998, pp. 127-132.

    3.3 Textual Cues

      Lin 04   M. Lin, J.F. Nunamaker, M. Chau, H. Chen. Segmentation of lecture videos based on text: a method combining multiple linguistic features. Proceedings of the 37th Hawaii International Conference on System Sciences, Big Island, Hawaii, 2004, pp. 3-11.
      Ponceleon 01   D. Ponceleon, S. Srinivasan. Automatic discovery of salient segments in imperfect speech transcripts. Proceedings of the 10th International Conference on Information and Knowledge Management (CIKM ’01), Atlanta, GA, 2001, pp. 490-497.
      Yang 03   H. Yang, L. Chaisorn, Y. Zhao, S. Neo, T. Chua. VideoQA: question answering on news video. Proceedings of the 11th ACM International Multimedia Conference and Exhibition (MM 03), Berkeley, California, 2003, pp. 632-641.
      Landauer 98   T.K. Landauer, P.W. Foltz, D. Laham. An introduction to latent semantic analysis. Discourse Processes, Volume 25, 1998, pp. 259-284.
      Yang 96   Y. Yang, J. Wilbur. Using corpus statistics to remove redundant words in text categorization. Journal of the American Society for Information Science, Volume 47, Issue 5, May, 1996, pp. 357-369.

    3.4 Audio/Video versus Structure, Cross Referencing

      Hauptmann 03   A.G. Hauptmann, R. Jin, T.D. Ng. Video retrieval using speech and image information. Proceedings of the 15th Electronic Imaging Conference, Santa Clara, CA, 2003.
      Liu 98   Z. Liu, Y. Wang, T. Chen. Audio feature extraction and analysis for scene segmentation and classification. Journal of VLSI Signal Processing Systems, Volume 20, Issue 1-2, October 1998, pp. 61-79.
      Sundaram 00   H. Sundaram, S.-F. Chang. Determining computable scenes in films and their structures using audio-visual memory models. Proceedings of the 8th ACM International Conference on Multimedia, Los Angeles, CA, 2000, pp. 95-104.
      Syeda-Mahmood 00   T. Syeda-Mahmood, S. Srinivasan. Detecting topical events in digital video. Proceedings of the 8th ACM International Conference on Multimedia, Los Angeles, CA, 2000, pp. 85-94.
      Waibel 03   A. Waibel, T. Schultz, M. Bett, M. Denecke, R. Malkin, I. Rogina, R. Stiefelhagen, J. Yang. SMaRT: the smart meeting room task at ISL. Proceedings of the 2003 International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Hong Kong, China, 2003, pp. IV 752-IV 755. [conference not held due to SARS]

    3.5 User Interfaces

      Altman 02   E. Altman, Y. Chen, W.C. Low. Semantic exploration of lecture videos. Proceedings of the 10th ACM International Conference on Multimedia, Juan-les-Pins, France, 2002, pp. 416-417.
      Whittaker 99   S. Whittaker, J. Hirschberg, J. Choi, D. Hindle, F. Pereira, A. Singhal. SCAN: Designing and evaluating user interfaces to support retrieval from speech archives. Proceedings of the 22nd International Conference on Research and Development in Information Retrieval (SIGIR ’99), Berkeley, CA, 1999, pp. 26-33.
      Young 97   S.J. Young, M.G. Brown, J.T. Foote, G.J.F. Jones, K.S. Jones. Acoustic indexing for multimedia retrieval and browsing. Proceedings of the 22nd International Conference on Acoustics, Speech and Signal Processing (ICASSP 97), Munich, Germany, 1997, pp. 1:199-202.
      Girgensohn 01   A. Girgensohn, J. Boreczky, L. Wilcox. Keyframe-based user interfaces for digital video. IEEE Computer, Volume 34, Number 9, September 2001, pp. 61-67.
      Worring 04   M. Worring, G.P. Nguyen, L. Hollink, J.C. van Gemert, D.C. Koelma. Accessing video archives using interactive search. Proceedings of the 2004 International Conference on Multimedia and Expo, Taipei, Taiwan, 2004, pp. ??

    3.6 Case Studies, Evaluation

      Lee 99   H. Lee, A.F. Smeaton, J. Furner. User interface issues for browsing digital video. Proceedings of the 21st Colloquium on Information Retrieval (IRSG ’99), Glasgow, UK, 1999.
      Abowd 96   G.D. Abowd, C.G. Atkeson, A. Feinstein, C. Hmelo, R. Kooper, S. Long, N. Sawhney, M. Tani. Teaching and learning as multimedia authoring: The classroom 2000 project. Proceedings of the 4th ACM International Multimedia Conference and Exhibition (MULTIMEDIA 96), Boston, Massachusetts, 1996, pp. 187-198.
      Brotherton 04   J.A. Brotherton, G.D. Abowd. Lessons learned from eClass: assessing automated capture and access in the classroom. ACM Transactions on Computer-Human Interaction, Vol. 11, No. 2, June 2004, pp. 121-155.
      Li 00   F.C. Li, A. Gupta, E. Sanocki, L. He, Y. Rui. Browsing digital video. Proceedings of the Conference on Human Factors in Computing Systems (CHI ’00), The Hague, Netherlands, 2000, pp. 169-176.
      Moran 97   T.P. Moran, L. Palen, S. Harrison, P. Chiu, D. Kimber, S. Minneman, W. van Melle, P. Zellweger. “I’ll get that off the audio”: A case study of salvaging multimedia meeting records. Proceedings of the Conference on Human Factors in Computing Systems (CHI ’97), Atlanta, GA, 1997, pp. 22-27.
      Vendrig 02   J. Vendrig, M. Worring. Evaluation measurement for logical story unit segmentation in video sequences. IEEE Transactions on Multimedia, Vol. 4, No. 4, December 2002, pp. 492-499.