Journal of Computer Sciences and Applications
ISSN (Print): 2328-7268; ISSN (Online): 2328-725X; Editor-in-chief: Minhua Ma, Patricia Goncalves
Journal of Computer Sciences and Applications. 2019, 7(1), 1-9
DOI: 10.12691/jcsa-7-1-1
Open Access Review Article

TV Stream Table of Content: A New Level in the Hierarchical Video Representation

Zein Al Abidin Ibrahim

Lebanese University, Faculty of Sciences, Section I, Beirut, Lebanon

Pub. Date: December 28, 2018

Cite this paper:
Zein Al Abidin Ibrahim. TV Stream Table of Content: A New Level in the Hierarchical Video Representation. Journal of Computer Sciences and Applications. 2019; 7(1):1-9. doi: 10.12691/jcsa-7-1-1


Despite the rapid development of today's technologies, TV has kept its position as one of the most important entertainment, and sometimes educational, utilities in daily life. Keeping this position, however, required major changes so that TV could keep pace with the digital revolution, such as digital broadcasting, High Definition TV, TV on demand, TV-REPLAY and WebTV. This evolution, together with other factors such as the wide spread of communication means and the low price of storage media, has made technologies for video content storage, structuring, search and retrieval indispensable. Video content can be viewed at several levels: a sequence of frames, a sequence of shots, a sequence of scenes, or a sequence of programs, which is what a TV stream is usually composed of. Structuring video content greatly helps to index, search and retrieve information from it efficiently. For example, structuring a soccer game into Play/Break phases facilitates the later detection of goals or the summarization of the soccer video. Another example is structuring a news program into stories, where each story is composed of an anchorperson segment followed by a report, which facilitates the later search for a specific story or intelligent navigation inside the news program. However, each of the existing analysis methods is dedicated to one type of video content. Such methods give very poor results when applied to a TV stream composed of several video programs. It is therefore important to first detect the boundaries of each program and then identify its type, so that the dedicated analysis method can be run based on that type. For a TV viewer, a TV stream is a sequence of programs (P) and breaks (B). Programs may be separated by breaks and may also include breaks.
For analysis purposes, the stream can be considered as a sequence of audio and video frames with no markers for the start and end points of the programs or breaks it contains. Most TV channels that produce TV streams provide a program guide for the broadcast programs. However, such guides usually lack precision, especially in the presence of live programs, whose start and end times are very hard to predict. Moreover, program guides do not include any information about breaks (i.e. commercials). Hence, one important step in structuring TV video content is to segment it into programs and then choose the appropriate method to segment each program separately based on its type. TV stream structuring consists in detecting the start and end of all the programs and breaks in the stream and then automatically annotating each program with metadata that summarizes its content or identifies its type. This step can be performed by analyzing the metadata provided with the stream (EPG or EIT) or by analyzing the audio-visual stream itself. In this article, we define what we call the TvToC (TV stream table of content), which adds a new level to the hierarchical video decomposition (the traditional video ToC). We then provide a comparative study of the methods and techniques in the domain of TV stream segmentation, highlighting the advantages and weaknesses of each approach.
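
As an illustration only, the top level of such a table of content can be sketched as a list of program (P) and break (B) segments with explicit start and end points. The sketch below is not the method of any surveyed approach: the names Segment and build_tv_toc are hypothetical, and it assumes that some upstream detector has already assigned a "P" or "B" label to every time unit of the stream, which is precisely the hard part of the structuring problem.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List


@dataclass
class Segment:
    """One top-level entry of the (hypothetical) TvToC: a program or a break."""
    kind: str   # "P" for program, "B" for break
    start: int  # index of the first time unit (inclusive)
    end: int    # index just past the last time unit (exclusive)


def build_tv_toc(labels: str) -> List[Segment]:
    """Collapse per-unit labels such as "PPBBP" into contiguous segments.

    `labels` is assumed to come from an upstream detector; this function
    only recovers the start and end point of each run of identical labels.
    """
    toc: List[Segment] = []
    position = 0
    for kind, run in groupby(labels):
        length = len(list(run))
        toc.append(Segment(kind, position, position + length))
        position += length
    return toc


if __name__ == "__main__":
    # Two programs, each followed by a break.
    for segment in build_tv_toc("PPPPBBPPPBBB"):
        print(segment.kind, segment.start, segment.end)
```

Because a program may itself include breaks, two P segments separated by a B segment may belong to the same program. Deciding this requires additional information (for example program guide metadata or content analysis) that this sketch does not model, and the lower levels of the hierarchy (scenes, shots, frames) are left out as well.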

Keywords: TV stream structuring, video structuring, near duplicate detection, classification

This work is licensed under a Creative Commons Attribution 4.0 International License.



