Now that we have begun to send out invites to BitTorrent Live broadcaster applicants, it seemed like a great time to begin a series of technical articles discussing the product. We’re taking slow steps toward completely opening the floodgates, trying to make sure that we have a usable, rock-solid product once we release it to the world. Right now, you can enjoy the channels of our first broadcasters without an invite.
Meanwhile, expect to see explanations of how Live operates internally, as well as tutorials on how to put it to use, here on the BitTorrent Engineering Blog.
The development of BitTorrent Live has consisted of work in essentially two camps. First is the Python-based ‘core’, which handles the actual functioning of the protocol, as well as the underlying cryptography used when transmitting and receiving data. Second is the ‘app’, which consists of a tight layer around the core, providing a command line interface, as well as a layer beyond that which provides a Windows, Mac, or Linux GUI.
As the core has become more stable over time, and major changes to the protocol less frequent, adding features like Local Peer Discovery has become possible. Local Peer Discovery is present in all major BitTorrent clients, and provides significant performance and efficiency gains. Live is an entirely new protocol, written from the ground up, so adding support required a bit of planning.
Local Peer Discovery is a mechanism by which peer-to-peer (P2P) applications can discover other clients consuming identical content within a local network. Communication can take place far more efficiently between two machines on a LAN than with machines out on the internet, and as such, it makes sense to prioritize data transfer between two local machines whenever possible. Doing so can reduce load on the outbound link to the internet as well.
BitTorrent Live streams video between clients in real-time, and thus requires a generally higher performance network than the original BitTorrent, which has the luxury of being able to send or receive data at slower than real-time when necessary. The speed of the incoming data to Live (from all peers) cannot be less than the bitrate of the video being played. Preferably, there should also be an amount of overhead (excess network capacity) to allow for any lost or damaged data.
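As a rough illustration of that constraint, the sketch below sums the incoming rate from each peer and compares it with the video bitrate plus a safety margin. The function name, the 20% margin, and the sample rates are all hypothetical; this is not how Live itself does its accounting.

```python
def can_sustain_playback(peer_rates_kbps, video_bitrate_kbps, overhead_fraction=0.2):
    """Return True if the combined incoming rate covers the bitrate plus headroom.

    overhead_fraction is an assumed safety margin for lost or damaged data.
    """
    required_kbps = video_bitrate_kbps * (1.0 + overhead_fraction)
    return sum(peer_rates_kbps) >= required_kbps


# Three hypothetical peers feeding a 1500 kbps stream: 1950 kbps in, about 1800 needed.
print(can_sustain_playback([600, 700, 650], 1500))
# Two peers only: 1300 kbps in, which is below even the raw bitrate.
print(can_sustain_playback([600, 700], 1500))
```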
Due to these relatively demanding requirements, Local Peer Discovery is a particularly attractive option for Live. Effectively any number of machines would be able to watch on a local network, while only one copy of the playing video would need to enter the network from the outside. Even in less extreme circumstances, playback performance would benefit from having local, low-latency peers.
UDP Multicast for Local Peer Discovery
Local Peer Discovery is typically implemented with UDP Multicast, much as Bonjour/ZeroConf is. UDP is an alternative transport protocol to TCP, allowing a ‘fire-and-forget’ approach to transmitting data. Packets are not guaranteed to arrive at their destination, but in return no initial handshake or maintenance of an active connection is required. Generally, UDP is desirable when sending ‘unimportant’ data, or when speed of delivery matters more than integrity of the data. BitTorrent Live uses UDP for nearly all of its communications, in the interest of minimizing latency and being able to rapidly hop between peers without constructing and maintaining long-lived connections with each.
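To make ‘fire-and-forget’ concrete, here is a minimal sketch that sends one plain (unicast) UDP datagram over the loopback interface. The addresses and payload are arbitrary, and this is not Live's wire protocol. It is written to run on Python 3, though the same calls exist in 2.7.

```python
import socket

# Receiver: bind to an OS-assigned port on loopback (port 0 means "pick one for me").
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # don't wait forever; UDP gives no delivery guarantee
receiver_address = receiver.getsockname()

# Sender: no connect(), no handshake. Address the datagram and send it.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", receiver_address)

try:
    data, source = receiver.recvfrom(4096)
except socket.timeout:
    data, source = None, None  # the datagram was lost, and nobody tells the sender
finally:
    sender.close()
    receiver.close()

print("received %r from %r" % (data, source))
```

Note that the sender never learns whether the datagram arrived. Any acknowledgement or retry logic has to be built by the application.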
Multicast is one of three commonly utilized IP routing schemes:
- The first, unicast, is the most common. It provides a one-to-one link between two IP addresses.
- Second, broadcast, transmits from one IP to all IPs on a local network. Broadcast is generally frowned upon by network administrators because it creates a large amount of traffic that may be sent to machines that don’t need to receive it.
- Third is multicast, which transmits from one IP to any number of IPs that have requested the data. The transmission is repeated to the many subscribers at the router level, meaning that the transmitter only has to send it once. The sender and the receivers don’t need to know each other’s IP addresses; they just need to agree on a multicast IP address (explained shortly) and a port.
A fixed range of IP addresses is designated for multicast use. The full range is 224.0.0.0 through 239.255.255.255, though some blocks within it are reserved. For local network use, choose an address within 239.0.0.0 through 239.255.255.255 as your multicast group. Selection is somewhat arbitrary. Just choose one that you like, and you’ll discover soon enough if another application is already using it.
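If you want to sanity-check a candidate group address, a small sketch along these lines can help. It assumes Python 3.3+ for the standard-library ipaddress module (on 2.7 you would need a backport), and the helper names are made up for illustration. IPv4 multicast is 224.0.0.0/4, and 239.0.0.0/8 is the administratively scoped block intended for private use.

```python
import ipaddress

# 239.0.0.0/8 is the administratively scoped (private use) multicast block.
ADMIN_SCOPED = ipaddress.ip_network("239.0.0.0/8")


def is_multicast(ip_string):
    """True for any address in the IPv4 multicast range 224.0.0.0/4."""
    return ipaddress.ip_address(ip_string).is_multicast


def is_local_scope_multicast(ip_string):
    """True only for addresses in the administratively scoped block."""
    return ipaddress.ip_address(ip_string) in ADMIN_SCOPED


for candidate in ("239.255.4.3", "224.0.0.251", "192.168.1.10"):
    print(candidate, is_multicast(candidate), is_local_scope_multicast(candidate))
```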
The peer discovery communication in BitTorrent Live works as follows:
- Figure out your local IP address (a little convoluted to do reliably in Python)
- Create a new UDP socket that subscribes to a specific multicast group (we use this for sending and receiving, because it’s convenient)
- Bind the socket to a network interface (which interface varies by OS)
- When a video channel is joined, announce via our socket to the group, “Hey! I’m SOMEIDENTIFIER, watching SOMECHANNEL!”. Periodically repeat the message.
- When you receive an announcement, check to make sure it’s a different client, not your own message being echoed back to you by the router. If it’s someone else who is watching a channel in common with you, add them to your list of peers.
- Request a full list of peers from the newly discovered peer via unicast (a normal connection, not multicast)
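The announce-and-filter steps above can be sketched without any networking at all. The message format here ("identifier channel", separated by a space) and all function names are assumptions made for illustration; Live's real wire format is not shown in this post.

```python
def build_announcement(client_id, channel):
    """Encode "I'm <client_id>, watching <channel>" as a small datagram payload."""
    return ("%s %s" % (client_id, channel)).encode("utf-8")


def parse_announcement(payload):
    """Return (client_id, channel), or None if the payload is malformed."""
    try:
        client_id, channel = payload.decode("utf-8").split(" ")
    except (UnicodeDecodeError, ValueError):
        return None
    return client_id, channel


def handle_announcement(payload, source_address, my_id, my_channels, peers):
    """Record the sender as a peer if it is someone else on a shared channel.

    peers maps channel -> set of (ip, port) tuples.
    Returns True if the sender qualifies as a peer.
    """
    parsed = parse_announcement(payload)
    if parsed is None:
        return False
    client_id, channel = parsed
    if client_id == my_id:          # our own message echoed back to us
        return False
    if channel not in my_channels:  # not watching anything in common
        return False
    peers.setdefault(channel, set()).add(source_address)
    return True


peers = {}
mine = build_announcement("client-A", "channel-1")
theirs = build_announcement("client-B", "channel-1")
handle_announcement(mine, ("192.168.1.5", 1234), "client-A", {"channel-1"}, peers)    # ignored
handle_announcement(theirs, ("192.168.1.7", 1234), "client-A", {"channel-1"}, peers)  # added
print(peers)
```

The final step, requesting the new peer's full peer list over unicast, is left out because it depends on the rest of the protocol.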
Implementation in Python
Use of UDP Multicast in Python (2.7 in this case) is generally straightforward, though there is an OS-specific gotcha:
- As mentioned in the comments within create_socket() below, binding has to be done differently on Mac vs Windows. OSX (for unknown reasons) will not receive any multicast data if bound to a specific interface, so it has to use ‘0.0.0.0’. On the other hand, Windows must be bound to a specific interface.
Now, some code. It should be somewhat more robust than many examples online, and should work on Windows, Mac, and Linux. This example is not multithreaded. I recommend using an event-based library for your network programming, such as Gevent or Twisted.
Because it just uses simple while loops, it can only operate in listen OR announce mode at any given time. You can start up multiple instances of it, though, to have a full demonstration on a single machine.
In a real application using one of the aforementioned event-based libs, you will be able to both send and receive using the same socket created in create_socket(). Because the example script structure can’t do both at the same time, I offset the bound port by one in announce mode so that you can run the script twice, once for each mode.
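For a feel of what sending and receiving on one socket looks like, here is a minimal sketch that uses the standard library's select module as a stand-in for Gevent or Twisted. The function name and parameters are hypothetical. To stay self-contained, the demo uses a plain loopback UDP socket that sends to itself rather than a real multicast socket.

```python
import select
import socket
import time


def announce_and_listen(sock, destination, message, rounds, interval=1.0):
    """Send `message` once per round, then collect whatever arrives for `interval` seconds.

    Returns a list of (data, address) tuples received across all rounds.
    """
    received = []
    for _ in range(rounds):
        sock.sendto(message, destination)
        deadline = time.time() + interval
        while True:
            remaining = deadline - time.time()
            if remaining <= 0:
                break
            # Block until the socket is readable or the interval runs out.
            readable = select.select([sock], [], [], remaining)[0]
            if readable:
                received.append(sock.recvfrom(4096))
    return received


# Demo: a loopback socket that "announces" to its own address.
demo_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
demo_socket.bind(("127.0.0.1", 0))
own_address = demo_socket.getsockname()
packets = announce_and_listen(demo_socket, own_address, b"ping", rounds=2, interval=0.2)
demo_socket.close()
print(len(packets))
```

With a socket from create_socket() and the multicast group as the destination, the same loop would both announce and hear other announcers, which is the behavior an event-based library gives you more cleanly.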
The full code is also available as a gist.
# Written by Aaron Cohen -- 1/14/2013
# Brought to you by BitTorrent, Inc.
# "We're not just two guys in a basement in Sweden." TM
# This work is licensed under a Creative Commons Attribution 3.0 Unported License.
# See: http://creativecommons.org/licenses/by/3.0/

import re
import socket
import struct
import sys
import time

def ip_is_local(ip_string):
    """
    Uses a regex to determine if the input ip is on a local network. Returns a boolean.
    It's safe here, but never use a regex for IP verification if from a potentially dangerous source.
    """
    combined_regex = r"(^10\.)|(^172\.1[6-9]\.)|(^172\.2[0-9]\.)|(^172\.3[0-1]\.)|(^192\.168\.)"
    return re.match(combined_regex, ip_string) is not None  # is not None is just a sneaky way of converting to a boolean

def get_local_ip():
    """
    Returns the first externally facing local IP address that it can find.
    Even though it's longer, this method is preferable to calling socket.gethostbyname(socket.gethostname()),
    as socket.gethostbyname() does not support IPv6 and getaddrinfo() is the recommended replacement.
    This also can discover multiple available IPs with minor modification.
    We exclude 127.0.0.1 if possible, because we're looking for real interfaces, not loopback.
    Some Linux systems always return 127.0.1.1, which we don't match as a local IP when checked with ip_is_local().
    We then fall back to the uglier method of connecting to another server.
    """
    # socket.getaddrinfo returns a bunch of info, so we just get the IPs it returns with this list comprehension.
    # Each entry is (family, type, proto, canonname, sockaddr), and sockaddr[0] is the IP string.
    local_ips = [ x[4][0] for x in socket.getaddrinfo(socket.gethostname(), 80)
                  if ip_is_local(x[4][0]) ]

    # select the first IP, if there is one.
    local_ip = local_ips[0] if len(local_ips) > 0 else None

    # If the previous method didn't find anything, use this less desirable method that lets your OS figure out which
    # interface to use.
    if not local_ip:
        # create a standard UDP socket ( SOCK_DGRAM is UDP, SOCK_STREAM is TCP )
        temp_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            # Open a connection to one of Google's DNS servers. Preferably change this to a server in your control.
            # connect() on a UDP socket sends nothing; it only makes the OS pick a route and source address.
            temp_socket.connect(("8.8.8.8", 9))
            # Get the interface used by the socket.
            local_ip = temp_socket.getsockname()[0]
        except socket.error:
            # Only return 127.0.0.1 if nothing else has been found.
            local_ip = "127.0.0.1"
        finally:
            # Always dispose of sockets when you're done!
            temp_socket.close()
    return local_ip

def create_socket(multicast_ip, port):
    """
    Creates a socket, sets the necessary options on it, then binds it. The socket is then returned for use.
    """
    local_ip = get_local_ip()

    # create a UDP socket
    my_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    # allow reuse of addresses
    my_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

    # set multicast interface to local_ip
    my_socket.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF, socket.inet_aton(local_ip))

    # Set multicast time-to-live to 2...should keep our multicast packets from escaping the local network
    my_socket.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)

    # Construct a membership request...tells router what multicast group we want to subscribe to
    membership_request = socket.inet_aton(multicast_ip) + socket.inet_aton(local_ip)

    # Send add membership request to socket
    # See http://www.tldp.org/HOWTO/Multicast-HOWTO-6.html for explanation of sockopts
    my_socket.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request)

    # Bind the socket to an interface.
    # If you bind to a specific interface on the Mac, no multicast data will arrive.
    # If you try to bind to all interfaces on Windows, no multicast data will arrive.
    # Hence the following.
    if sys.platform.startswith("win"):
        my_socket.bind((local_ip, port))
    else:
        my_socket.bind(("0.0.0.0", port))

    return my_socket
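
def get_bound_multicast_interface(my_socket):
    """
    Returns the IP address (probably your local IP) that the socket is bound to for multicast.
    Note that this may not be the same address you bound to manually if you specified 0.0.0.0.
    This isn't used here, just a useful utility method.
    """
    response = my_socket.getsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF)
    # getsockopt hands back the address as a native signed integer; repack it into 4 bytes and format it.
    socket_address = socket.inet_ntoa(struct.pack("i", response))
    return socket_address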

def drop_multicast_membership(my_socket, multicast_ip):
    """
    Drops membership to the specified multicast group without closing the socket.
    Note that this happens automatically (done by the kernel) if the socket is closed.
    """
    local_ip = get_local_ip()

    # Must reconstruct the same request used when adding the membership initially
    membership_request = socket.inet_aton(multicast_ip) + socket.inet_aton(local_ip)

    # Leave group
    my_socket.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP, membership_request)

def listen_loop(multicast_ip, port):
    my_socket = create_socket(multicast_ip, port)

    while True:
        # Data waits on socket buffer until we retrieve it.
        # NOTE: Normally, you would want to compare the incoming data's source address to your own, and filter it out
        #       if it came from the current machine. Everything you send gets echoed back at you if your socket is
        #       subscribed to the multicast group.
        data, address = my_socket.recvfrom(4096)
        print("%s says the time is %s" % (address, data))

def announce_loop(multicast_ip, port):
    # Offset the port by one so that we can send and receive on the same machine
    my_socket = create_socket(multicast_ip, port + 1)

    # NOTE: Announcing every second, as this loop does, is WAY aggressive. 30 - 60 seconds is usually
    #       plenty frequent for most purposes.
    while True:
        # Just sending Unix time as a message (encoded to bytes so this also runs on Python 3)
        message = str(time.time()).encode("ascii")

        # Send data. Destination must be a tuple containing the ip and port.
        my_socket.sendto(message, (multicast_ip, port))
        time.sleep(1)

if __name__ == '__main__':
    # Choose an arbitrary multicast IP and port.
    # 239.0.0.0 - 239.255.255.255 are for local network multicast use.
    # Remember, you subscribe to a multicast IP, not a port. All data from all ports
    # sent to that multicast IP will be echoed to any subscribed machine.
    multicast_address = "239.255.4.3"
    multicast_port = 1234

    # When launching this example, you can choose to put it in listen or announce mode.
    # Announcing doesn't require binding to a port, but we do it here just to reuse code.
    # It binds to the requested port + 1, allowing you to run the announce and listen modes
    # on the same machine at the same time.
    # In a real case, you'll most likely send and receive from the same port using Gevent or Twisted,
    # so the code in create_socket() will apply more directly.
    mode = sys.argv[1] if len(sys.argv) > 1 else None
    if mode == "listen":
        listen_loop(multicast_address, multicast_port)
    elif mode == "announce":
        announce_loop(multicast_address, multicast_port)
    else:
        sys.exit("Run 'multicast_example.py listen' or 'multicast_example.py announce'.")
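The listing's ip_is_local() warns against using a regex on untrusted input, since a prefix match would happily accept a string like "10.evil.example.com". If you do need to check addresses from an untrusted source, one stricter option is the ipaddress module, which is standard in Python 3.3+ and needs a backport on 2.7. The sketch below is an alternative offered as an assumption, not part of the original example, and the function name is made up.

```python
import ipaddress

# The three private IPv4 blocks covered by the regex in the listing (RFC 1918).
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def ip_is_local_strict(ip_string):
    """Return True only if ip_string is a well-formed IPv4 address in a private block."""
    try:
        address = ipaddress.IPv4Address(ip_string)
    except ValueError:
        return False  # not a valid dotted-quad address at all
    return any(address in network for network in PRIVATE_NETWORKS)


for candidate in ("192.168.1.20", "172.32.0.1", "10.evil.example.com"):
    print(candidate, ip_is_local_strict(candidate))
```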
Additional Reading Material
Changing or reading settings with socket.setsockopt() or socket.getsockopt() can seem like a bit of a dark art, because the flags aren’t documented anywhere in the main Python docs. A list of the ones supported in Python can be found in the Python source code.
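Another way to see which flags your own Python build exposes is to introspect the socket module. This is a small sketch with a made-up helper name. The list varies by platform, because Python only exports the constants that the underlying OS defines.

```python
import socket


def socket_option_names(prefix):
    """Return the sorted socket-module constants whose names start with prefix."""
    return sorted(name for name in dir(socket) if name.startswith(prefix))


# Print the IP-level options (IP_MULTICAST_TTL, IP_ADD_MEMBERSHIP, ...) with their numeric values.
for name in socket_option_names("IP_"):
    print("%-24s %d" % (name, getattr(socket, name)))
```

Passing "SO_" instead lists the socket-level options, such as SO_REUSEADDR.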
Also useful is the Linux Documentation Project’s writeup on multicast socket flags.
Hopefully this example will help shed some more light on how to build a robust multicast Python application. While not a complete implementation, the needed building blocks should be present. If you have any questions, comments, or corrections, please feel free to post them below. I’ll do my best to stay on top of them and pipe in when necessary.