Friday, August 26, 2011

VoIP: deja-vu 30 years later

These days I am working for a client on quite an interesting project that uses live video chat over an IP network. The fun part is that I had to implement it in Java, at least from the control perspective: establishing and controlling RTP stream capture and playback.
After a couple of hours of investigation, the gstreamer project turned out to be quite useful here. It lets you build complex media recording and playback functionality in a way very similar to Unix command-line pipes. Using the Java binding project gstreamer-java, I can build and control an audio/video pipeline from the Java environment, which looks great. I don't even have to deal with the complexities of the RTP and RTCP protocols, since the relevant RTP plugins are already available. I also learned that established IM solutions like Telepathy and Pidgin (known especially in Linux circles) use gstreamer (gstreamer-farsight) to render audio/video chat, too. However, I couldn't find any example of a ready-made pipeline that is full-duplex, carries both audio and video, and mixes the remote party's video with live video from the local webcam. Even the gstreamer-devel forum was not very helpful; maybe my questions seemed too trivial to answer. I had to go through trial and error on my own. Accomplished. I am posting the full gst-launch pipeline here in case it saves somebody some hair-pulling.


DEST=remote_machine_address
gst-launch -v \
gstrtpbin name=rtpbin \
videomixer name=mix sink_0::zorder=200 sink_1::zorder=100 sink_0::xpos=524 sink_0::ypos=20 sink_2::xpos=524 sink_2::ypos=100 sink_2::zorder=300 ! xvimagesink \
v4l2src ! video/x-raw-yuv, framerate=30/1, width=320, height=240 ! tee name=t ! queue ! videoscale ! videorate ! "video/x-raw-yuv,width=352,height=288,framerate=30/1" ! ffenc_h263 ! rtph263pay ! rtpbin.send_rtp_sink_0 \
rtpbin.send_rtp_src_0 ! udpsink host=$DEST port=5000 sync=false async=false \
rtpbin.send_rtcp_src_0 ! udpsink host=$DEST port=5001 sync=false async=false    \
udpsrc port=5005 ! rtpbin.recv_rtcp_sink_0 \
alsasrc ! audioconvert ! amrnbenc ! rtpamrpay ! rtpbin.send_rtp_sink_1 \
rtpbin.send_rtp_src_1 ! udpsink host=$DEST port=5002 sync=false async=false \
rtpbin.send_rtcp_src_1 ! udpsink host=$DEST port=5003 sync=false async=false \
udpsrc port=5007 ! rtpbin.recv_rtcp_sink_1 \
udpsrc caps="application/x-rtp,media=(string)audio,clock-rate=(int)8000,encoding-name=(string)AMR,encoding-params=(string)1,octet-align=(string)1" port=5002 ! rtpbin.recv_rtp_sink_3 \
udpsrc port=5003 ! rtpbin.recv_rtcp_sink_3 \
rtpbin.send_rtcp_src_3 ! udpsink host=$DEST port=5007 sync=false async=false \
rtpbin. ! rtpamrdepay ! amrnbdec ! alsasink \
udpsrc caps="application/x-rtp,media=(string)video,clock-rate=(int)90000,encoding-name=(string)H263" port=5000 ! rtpbin.recv_rtp_sink_2 \
udpsrc port=5001 ! rtpbin.recv_rtcp_sink_2 \
rtpbin.send_rtcp_src_2 ! udpsink host=$DEST port=5005 sync=false async=false        \
rtpbin. ! rtph263depay ! ffdec_h263 ! videoscale ! video/x-raw-yuv, width=704, height=576 ! textoverlay font-desc="Sans 18" text="REMOTE" valign=top halign=left shaded-background=true ! videobox border-alpha=1 top=1 left=1 right=1 bottom=1 ! videorate ! mix. \
t. ! videoscale ! videorate ! video/x-raw-yuv, framerate=1/1, width=160, height=120 ! videoflip method=horizontal-flip ! videobox border-alpha=0.25 top=-5 left=-5 right=-5 bottom=-5 ! queue ! mix.


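The snippet above uses the H.263 codec for video and the AMR codec for audio. There are four RTP sessions (sessions 0 and 1 send the local video and audio, sessions 2 and 3 receive the remote video and audio), a single rtpbin and a single videomixer. It should be easy to adapt to other codecs supported by gstreamer.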
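
Since the controlling side is Java, the session and port layout of the pipeline can be written down as a small helper. This is only a sketch: the class and method names (RtpSessionPlan, sendSession, receiveSession) are made up for illustration, and the code only prints the UDP transport fragments with the same ports as above. It does not call gstreamer or gstreamer-java.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: emits the UDP transport fragments of the pipeline above. */
public class RtpSessionPlan {

    // udpsink options used on every sender line of the original pipeline.
    private static final String SINK_OPTS = " sync=false async=false";

    /** Fragments for a session that sends local media (sessions 0 and 1 above). */
    public static List<String> sendSession(int id, String dest,
                                           int rtpPort, int rtcpPort, int rtcpBackPort) {
        List<String> parts = new ArrayList<String>();
        // Outgoing RTP media packets.
        parts.add("rtpbin.send_rtp_src_" + id + " ! udpsink host=" + dest
                + " port=" + rtpPort + SINK_OPTS);
        // Outgoing RTCP for this session.
        parts.add("rtpbin.send_rtcp_src_" + id + " ! udpsink host=" + dest
                + " port=" + rtcpPort + SINK_OPTS);
        // RTCP coming back from the remote receiver.
        parts.add("udpsrc port=" + rtcpBackPort + " ! rtpbin.recv_rtcp_sink_" + id);
        return parts;
    }

    /** Fragments for a session that receives remote media (sessions 2 and 3 above). */
    public static List<String> receiveSession(int id, String dest, String caps,
                                              int rtpPort, int rtcpPort, int rtcpBackPort) {
        List<String> parts = new ArrayList<String>();
        // Incoming RTP; udpsrc needs caps so rtpbin knows the payload type.
        parts.add("udpsrc caps=\"" + caps + "\" port=" + rtpPort
                + " ! rtpbin.recv_rtp_sink_" + id);
        // Incoming RTCP from the remote sender.
        parts.add("udpsrc port=" + rtcpPort + " ! rtpbin.recv_rtcp_sink_" + id);
        // Our own RTCP for this session, sent back to the remote sender.
        parts.add("rtpbin.send_rtcp_src_" + id + " ! udpsink host=" + dest
                + " port=" + rtcpBackPort + SINK_OPTS);
        return parts;
    }

    public static void main(String[] args) {
        String dest = args.length > 0 ? args[0] : "remote_machine_address";
        String videoCaps = "application/x-rtp,media=(string)video,"
                + "clock-rate=(int)90000,encoding-name=(string)H263";
        String audioCaps = "application/x-rtp,media=(string)audio,"
                + "clock-rate=(int)8000,encoding-name=(string)AMR,"
                + "encoding-params=(string)1,octet-align=(string)1";

        List<String> all = new ArrayList<String>();
        all.addAll(sendSession(0, dest, 5000, 5001, 5005));               // video out
        all.addAll(sendSession(1, dest, 5002, 5003, 5007));               // audio out
        all.addAll(receiveSession(2, dest, videoCaps, 5000, 5001, 5005)); // video in
        all.addAll(receiveSession(3, dest, audioCaps, 5002, 5003, 5007)); // audio in

        for (String fragment : all) {
            System.out.println(fragment);
        }
    }
}
```

Both machines run the same layout, so each side sends to the ports on which the other side listens: 5000/5001 for video RTP and RTCP, 5002/5003 for audio, and 5005/5007 for the RTCP coming back from the receivers. The encoder, decoder and mixer branches still have to be added around these fragments exactly as in the full pipeline.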

What made me post this, however, is something else. I feel like I am back in time, when I was 6 and built my first telephone from one room to another using just spare old telephone microphone modules, a 4.5 V battery and long wires. I could chat with my brother remotely, from room to room. What excitement! Today, 30 years later, I did the same when I tested the gstreamer-based RTP video chat with my wife. The technology, however, has changed. Instead of analogue, amplitude-modulated voice transfer, I am using a packet-switched network and digital signal processing, with a complex yet miniature CCD camera, all transferred wirelessly in the gigahertz band. Amazing. Let's see what the next 30 years will bring!

About Me

Peter is a technology enthusiast excited about internet and telecommunication technology that brings new quality to our everyday lives. A software 'veteran' of 10+ years, he is a former Oskar, Vodafone and IBM employee in the area of service delivery platform design and development. Having written his first Basic code at the age of 12 and moved on through Java, Enterprise Java and Android, Peter is now exploring ways to bring all of these power toys to real work.