Venkatramanan Arun, Karan Sawhney, Udit Mehta
Introduction
Television has always been a source of excitement and a form of entertainment.
Over the years, televisions have also slowly invaded our living rooms. Yet watching has become a passive activity, and much of the excitement the television induced in its early years has been lost. The television no longer delivers the unique, immersive entertainment experience it once brought to the living room.
Our proposal deals with developing a product that changes the user’s television viewing experience. To make television viewing more interactive and immersive, we present a set-top box of the future that allows a user to interact with a TV show of their choice. We developed a system that is synchronized with sequences in a movie or a pre-recorded sports game and integrates physical elements such as lighting, haptic feedback and synthesized sound effects, so that the entire room is activated when an appropriate context-aware video signal is received (e.g., when the Phillies hit a home run, the living room lights up and the team song plays in the background).
The introduction of tablet PCs and online video streaming has created an environment where users have multiple sources of entertainment and continuously shift from one source to another, essentially breaking down the living-room dynamic. Viewers' demands have evolved over the years, and moving from one source of content to another has created a break in viewership.
Our solution focuses on the authoring and runtime system for such embedded digital content; the design and development of the overall set-top-box system architecture; and a demonstration of the integrated system with popular TV shows. The Immersive Ambience application delivers a rich viewing experience with autonomous features such as gesture recognition, appliance control and mood lighting, all integrated on one box. In this way we can merge the virtual and the real world while demonstrating the power of the platform.
Motivation
TV watching is a passive activity and, left as it is, will remain one. The current TV viewing experience does not highlight the significance of the final moments of a sports game or a movie’s climax: it involves only a TV screen and a remote control in a loop. The challenge is to integrate more elements into that loop and redefine the TV viewing experience. Traditional TV lulls the viewer to sleep right before the exciting moments of a show or game. About 60% of all US households own a connected TV, and a large fraction of these connections are web-capable. With the rise of computers capable of streaming videos in high definition, televisions risk becoming obsolete within a few decades.
Cyber-Physical Processing / Hardware API
We created a more immersive experience by allowing the video stream to embed context PIDs that are relevant to events within the TV show or that would appeal to the viewer. The project incorporates five features to create the immersive experience we are aiming for. The i-Vest, the ambient lighting and the backlight system are controlled by slave devices that receive commands from the set-top box. The hardware includes:
1. The i-Vest:
This haptic feedback device is designed to increase the immersiveness of action sequences in movies and TV shows. The vest includes a multitude of solenoids and eccentric-mass motors controlled by custom electronics and an Mbed microcontroller. When the person on screen is shot, the vest simulates the appropriate sensation (bullet hit, vibration, and/or heat) at the corresponding location on the viewer’s body.
The circuit we used to drive the motors and solenoids is shown here. A PWM signal from the Mbed is supplied at the base of the driver transistor, which drives the solenoids and motors during the ON period.
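To make the event-to-actuation mapping concrete, here is a minimal sketch (not the actual Mbed firmware) of how a tagged on-screen hit could be turned into an actuator channel and PWM duty cycle for the vest. The `VestEvent` structure, `NUM_ACTUATORS`, and all values are illustrative assumptions, not the project's exact data format.

```python
# Hypothetical sketch: map a tagged on-screen hit event to a vest actuator
# channel and PWM duty cycle. Names and field layout are assumptions.
from dataclasses import dataclass

NUM_ACTUATORS = 8        # assumed number of solenoid/motor channels on the vest
MAX_DUTY = 1.0           # full-on PWM duty cycle

@dataclass
class VestEvent:
    location: int        # body-region index encoded in the video metadata
    intensity: float     # 0.0 (light brush) .. 1.0 (direct hit)
    duration_ms: int     # how long to drive the actuator

def event_to_pwm(event: VestEvent):
    """Return (actuator_channel, duty_cycle, duration_ms) for one hit event."""
    channel = event.location % NUM_ACTUATORS
    duty = max(0.0, min(MAX_DUTY, event.intensity))
    return channel, duty, event.duration_ms

if __name__ == "__main__":
    # Example: a gunshot to the upper-left torso, 80% intensity, 150 ms.
    print(event_to_pwm(VestEvent(location=2, intensity=0.8, duration_ms=150)))
```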
2. Ambient Lighting and Sounds:
This subsystem is designed to provide the appropriate lighting and sound environment based on the sequence on the TV. For example, the lights turn on during half time in a sports game, or rumbling sounds indicate a goal in a soccer game. To demonstrate this feature, we used LED strips controlled by a slave device receiving commands from the master controller.
Each segment of three LEDs draws approximately 20 mA per color channel from the 12 V supply, i.e., up to 20 mA each from the red, green and blue LEDs. If the LED strip is on full white, the draw is therefore about 60 mA per segment.
To drive the red, green and blue LEDs, we varied the PWM duty cycle on the Mbed pins.
LED strip schematic
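A quick power-budget check follows from the figures above: 20 mA per color channel per three-LED segment, so about 60 mA per full-white segment. The sketch below only scales those numbers to a whole strip; the strip length used is an example value, not the one in our setup.

```python
# Power-budget sketch using the per-segment figures quoted above.
MA_PER_CHANNEL = 20      # mA per color channel per segment (from the text)
CHANNELS = 3             # red, green, blue
SEGMENTS = 25            # example strip length (assumption)
SUPPLY_V = 12            # 12 V supply

full_white_ma = MA_PER_CHANNEL * CHANNELS * SEGMENTS
print(f"Full-white draw: {full_white_ma} mA "
      f"(~{full_white_ma * SUPPLY_V / 1000:.1f} W at {SUPPLY_V} V)")
```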
3. Focus + Context Screens
Focus plus context screens consist of a hi-res display embedded in a larger low-res display. Image content is displayed such that the scale of the content is preserved, while the resolution varies according to which display region it falls in. Focus plus context screens can be used for several applications, such as editing print products, finding the shortest path on a map, and simulation games.
We used an LCD and a projector, together with the focus-plus-context-screen technique, to add an important feature to our project that enhances the immersive viewing of video content on the user's television.
This feature displays part of the video on the LCD screen (the hi-res display) and the rest around the LCD screen (the low-res display). This gives the user the illusion that the content on the LCD screen extends beyond it into the living room. For example, if someone is watching 'Finding Nemo', this feature lets the user focus on the main character of the movie while the rest of the video extends around the television, thereby creating a virtual world in the living room.
We implemented this using a computer acting as the server, an LCD screen as the focus display, and a projector as the context display. The block diagram showing the implementation is as follows. The input here is a movie clip.
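To illustrate the scale-preserving split between focus and context, here is a minimal geometry sketch. It assumes the projector covers the whole wall region with the LCD centered inside it; all sizes are made-up example values, and the real system calibrates these physically rather than from constants.

```python
# Sketch: which pixel rectangle of the source frame belongs on the LCD (focus)
# if the projector (context) covers the full wall and scale must be preserved.
def focus_crop(frame_w, frame_h, wall_w_cm, wall_h_cm, lcd_w_cm, lcd_h_cm):
    """Return (x0, y0, w, h) of the source-frame crop shown on the LCD."""
    # Fraction of the projected area physically covered by the LCD.
    frac_w = lcd_w_cm / wall_w_cm
    frac_h = lcd_h_cm / wall_h_cm
    crop_w = int(frame_w * frac_w)
    crop_h = int(frame_h * frac_h)
    # Centered crop, assuming the LCD sits centered on the wall.
    x0 = (frame_w - crop_w) // 2
    y0 = (frame_h - crop_h) // 2
    return x0, y0, crop_w, crop_h

if __name__ == "__main__":
    # 1920x1080 source, 300x170 cm projected area, 88x50 cm LCD (example values).
    print(focus_crop(1920, 1080, 300, 170, 88, 50))
```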
4. Surround Visualization
To change the ambience of the user's living room and give viewers the feeling that they are part of the video they are watching on the television, we added a feature called extended surrounding. This feature changes the ambience of the room based on the content being displayed in the video, making the user feel as if they are inside the video.
5. Backlight System - Ambilight
Ambilight is a lighting system that actively adjusts both brightness and color based upon picture content. To implement this feature, we divided the television picture into 64 segments and calculated the average color of each segment. Each segment corresponds to an LED on the strip wrapped around the television, and that LED emits the color given by the RGB value it receives from the controller. The controller accepts streaming RGB values and forwards them over SPI to the LED strip.
The ambilight system considerably reduces eye strain when there is a transition from a dark to a light scene. It also enhances the field of view by projecting light into the unfocused, peripheral area of the viewer's vision.
The schematic for the ambilight system is here:
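The color-extraction step can be summarized in a short sketch: split each frame into a grid of 64 segments and average the RGB values inside each one. The sketch below uses a plain nested list of (r, g, b) tuples as the frame and an assumed 8x8 grid; the real system reads frames from the video decoder and streams the resulting values to the LED controller over SPI.

```python
# Sketch of the ambilight color extraction: 64 segments, one average RGB each.
GRID_COLS, GRID_ROWS = 8, 8   # 64 segments, one per LED around the TV (assumed 8x8)

def segment_averages(frame):
    """frame[row][col] is an (r, g, b) tuple; returns 64 averaged triples."""
    h, w = len(frame), len(frame[0])
    seg_h, seg_w = h // GRID_ROWS, w // GRID_COLS
    averages = []
    for gy in range(GRID_ROWS):
        for gx in range(GRID_COLS):
            r_sum = g_sum = b_sum = 0
            count = seg_h * seg_w
            for y in range(gy * seg_h, (gy + 1) * seg_h):
                for x in range(gx * seg_w, (gx + 1) * seg_w):
                    r, g, b = frame[y][x]
                    r_sum += r; g_sum += g; b_sum += b
            averages.append((r_sum // count, g_sum // count, b_sum // count))
    return averages

if __name__ == "__main__":
    # Tiny 16x16 all-orange test frame.
    frame = [[(255, 128, 0)] * 16 for _ in range(16)]
    print(segment_averages(frame)[:4])
```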
The cyber-physical processing includes the embedding, tagging and linking of metadata within the video stream as Program Identifiers (PIDs). For a given video stream, the PIDs are processed to produce on-screen and cyber-physical responses. Metadata for each video is generated and fetched when the video is played. This metadata contains the overlay content for each video, indicating the actuations and effects associated with it. The metadata differs between sports and movies, since different actuations are involved in each. Each slave device is given an address, and only messages addressed to that slave are processed; otherwise the message is dropped.
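To make the idea of per-video metadata concrete, here is a hypothetical entry for one tagged event. The field names and values are assumptions for illustration only, not the project's actual schema; the master would read entries like this during playback and dispatch the corresponding effects to the addressed slaves.

```python
# Illustrative (hypothetical) metadata entry for one tagged event in a video.
home_run_event = {
    "timestamp_s": 5262.4,         # when the event occurs in the video
    "content_type": "sports",      # sports vs. movie selects the actuation set
    "slave_address": 0x02,         # example address of the ambient-lighting slave
    "effect": "team_celebration",  # light up the room, play the team song
    "duration_s": 10,
}
```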
For communication between the master and the slave devices, we exchange Open Sound Control messages using the BMAC protocol.
Open Sound Control Messaging
For the communication between the master and the multiple slaves involved in the project, we implemented the Open Sound Control (OSC) messaging protocol. OSC is a content format for messaging among computers and multimedia devices which is comparable to XML and JSON formats.
When an OSC server receives an OSC Message, it must invoke the appropriate OSC Methods in its OSC Address Space based on the OSC Message's OSC Address Pattern. This process is called dispatching the OSC Message to the OSC Methods that match its OSC Address Pattern. All the matching OSC Methods are invoked with the same argument data, namely the OSC Arguments in the OSC Message.
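A minimal sketch of this dispatching idea follows, assuming a simple address space of callable methods. It approximates OSC address-pattern matching with shell-style wildcards via `fnmatch` (real OSC matching has its own rules, e.g. '*' does not cross '/'), so it illustrates the concept rather than a conformant implementation.

```python
# Dispatch sketch: invoke every method whose address matches the message's
# address pattern, passing the same arguments to each match.
import fnmatch

class OscServer:
    def __init__(self):
        self.address_space = {}            # address -> method

    def add_method(self, address, method):
        self.address_space[address] = method

    def dispatch(self, address_pattern, *args):
        for address, method in self.address_space.items():
            if fnmatch.fnmatch(address, address_pattern):
                method(*args)              # same argument data for every match

if __name__ == "__main__":
    server = OscServer()
    server.add_method("/lights/strip1", lambda r, g, b: print("strip1", r, g, b))
    server.add_method("/lights/strip2", lambda r, g, b: print("strip2", r, g, b))
    server.dispatch("/lights/*", 255, 120, 0)   # matches both strips
```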
The OSC message format:
The OSC message consists of an address pattern, a type tag and zero or more arguments.
Our application has a master communicating with multiple slaves using the OSC protocol. It resembles the master invoking a remote procedure call on the slave.
The message format:
The message format specifies that each OSC message is 8 bytes long, with the fields listed below (an illustrative packing sketch follows the list).
Fields:
Address: This field specifies the address of the slave meant to receive the message. Each slave checks this address; if the message is meant for it, the slave unpacks the rest of the message, otherwise it drops the message.
Sub Address: This field specifies the function on the slave that needs to be called. The slave reads the sub address and calls the appropriate function with the parameters attached to the message.
Start Type: This field indicates the end of the address and the start of the type tags (usually a ',').
Parameters: This field contains the parameters required by the function named in the sub address. It can also be a time tag indicating a duration in seconds that the slave may need (e.g., if it plays a musical instrument, this tells it how long to play).
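The sketch below packs and filters one such message. The document only fixes the total size at 8 bytes, so the byte widths here (1 byte address, 1 byte sub-address, 1 byte ',' start type, 5 parameter bytes) are assumptions for illustration, not the project's exact wire format.

```python
# Hedged sketch of one possible 8-byte layout for the master/slave messages.
import struct

def pack_message(address: int, sub_address: int, params: bytes) -> bytes:
    """Pack address, sub-address, ',' start type, and up to 5 parameter bytes."""
    if len(params) > 5:
        raise ValueError("at most 5 parameter bytes in this assumed layout")
    params = params.ljust(5, b"\x00")            # pad unused parameter bytes
    return struct.pack("BBc5s", address, sub_address, b",", params)

MY_ADDRESS = 0x02    # example address assigned to one slave

def on_message(msg: bytes) -> None:
    """Slave-side handling: drop messages not addressed to this slave."""
    if len(msg) != 8 or msg[0] != MY_ADDRESS:
        return                                   # not for us: drop silently
    sub_address, params = msg[1], msg[3:]        # msg[2] is the ',' start type
    print(f"call function {sub_address} with params {params.hex(' ')}")

if __name__ == "__main__":
    # Tell slave 0x02, function 0x01 (e.g. "set color"), to show orange.
    msg = pack_message(0x02, 0x01, bytes([255, 120, 0]))
    print(len(msg), "bytes:", msg.hex(" "))
    on_message(msg)
```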
OSC Message Types:
Performance Measures
Quantitative metrics which were used for the measurements include:
1. Communication: The amount of time delay between gathering visual data and sending it from the master base station to the slave components.
2. Cyber-Physical System performance: controllability of lighting, manipulation of synthesized sound and haptic feedback devices.
3. System Integration: Ability of the entire system as a whole in creating an immersive experience.
4. Real Time Response: The effects should take place in real time without any time lag.
5. Cost: The total cost of the system.
Component List
System Architecture
Conclusion
This project has evolved from where it was left previously. We worked on creating as many effects and actuations as we could and ended up with five distinct features.
The biggest challenge was to develop a messaging API so that our system could be a plug-n-play system with ease of attachment and detachment of new devices. To address this challenge, we implemented the OSC messaging protocol which helped us categorize the slaves and address the slave and its sub-device effectively.
Another challenge was developing the backlight system, since we first had to grab the average color of each segment of every video frame. After this, we had to make our slave accept streaming RGB values and actuate the LED strip accordingly.
The focus plus context screen still requires some work on synchronization so that we can start the video on the set-top box and the projector simultaneously.
The project can be developed even further by recognizing events in the streaming video rather than manually entering tags. This would make it perfect for live streaming of sports.
Finally, our system provides a low-cost, efficient way to build an interactive TV around an actuated environment and enhance the user's experience.










