Learning Objectives
After studying this section you should be able to do the following:
- Understand the layers that make up the Internet—application protocol, transmission control protocol, and Internet protocol—and describe why each is important.
- Discuss the benefits of Internet architecture in general and TCP/IP in particular.
- Name applications that should use TCP and others that might use UDP.
- Understand what a router does and the role these devices play in networking.
- Conduct a traceroute and discuss the output, demonstrating how Internet interconnections work in getting messages from point to point.
- Understand why mastery of Internet infrastructure is critical to modern finance and be able to discuss the risks in automated trading systems.
- Describe VoIP, and contrast circuit versus packet switching, along with organizational benefits and limitations of each.
TCP/IP: The Internet’s Secret Sauce
OK, we know how to read a Web address, we know that every device connected to the Net needs an IP address, and we know that the DNS can look at a Web address and find the IP address of the machine that you want to communicate with. But how does a Web page, an e-mail, or an iTunes download actually get from a remote computer to your desktop?
For our next part of the Internet journey, we’ll learn about two additional protocols: TCP and IP. These protocols are often written as TCP/IP and pronounced by reading all five letters in a row, “T-C-P-I-P” (sometimes they’re also referred to as the Internet protocol suite). TCP and IP are built into any device that a user would use to connect to the Internet, from handhelds to desktops to supercomputers, and together TCP/IP make internetworking happen.
TCP and IP operate below HTTP and the other application transfer protocols mentioned earlier. TCP (transmission control protocol) works its magic at the start and endpoint of the trip, on both your computer and the destination computer you’re communicating with. Let’s say a Web server wants to send you a large Web page. The Web server application hands the Web page it wants to send to its own version of TCP. TCP then slices up the Web page into smaller chunks of data called packets (or datagrams). The packets are like little envelopes, each containing part of the entire transmission and labeled with a destination address (where it’s going) and a source address (where it came from). Now we’ll leave TCP for a second, because TCP on the Web server then hands those packets off to the second half of our dynamic duo, IP.
It’s the job of IP (Internet protocol) to route the packets to their final destination, and those packets might have to travel over several networks to get to where they’re going. The relay work is done via special computers called routers, and these routers speak to each other and to other computers using IP. Since routers are connected to the Internet, they have IP addresses, too, and some are even named. Every computer on the Internet is connected to a router, and all routers are connected to at least one (and usually more than one) other router, linking up the networks that make up the Internet.
Routers don’t have perfect, end-to-end information on all points in the Internet, but they do talk to each other all the time, so a router has a pretty good idea of where to send a packet to get it closer to where it needs to end up. This chatter between the routers also keeps the Internet decentralized and fault-tolerant. Even if one path out of a router goes down (a networking cable gets cut, a router breaks, the power to a router goes out), as long as there’s another connection out of that router, then your packet will get forwarded. Networks fail, so good, fault-tolerant network design involves having alternate paths into and out of a network.
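The idea of routing around a failed link can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the four-router network named A through D is invented, and the search looks at the whole map at once, whereas real routers (as noted above) lack end-to-end information and decide only the next hop.

```python
from collections import deque

# Hypothetical network: each router lists the routers it links to directly.
LINKS = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C"},
}

def find_path(links, source, destination, down=frozenset()):
    """Breadth-first search for a hop-by-hop path, skipping failed links."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        here = path[-1]
        if here == destination:
            return path
        for neighbor in sorted(links[here]):
            link = frozenset((here, neighbor))
            if neighbor not in visited and link not in down:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no surviving route to the destination

# Normal conditions: the packet goes A -> B -> D.
print(find_path(LINKS, "A", "D"))
# The A-B cable is cut: the packet is forwarded A -> C -> D instead.
print(find_path(LINKS, "A", "D", down={frozenset(("A", "B"))}))
```

If every link out of router A is listed as down, `find_path` returns `None`, which is the case fault-tolerant design tries to avoid by providing alternate paths.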
Once packets are received by the destination computer (your computer in our example), that machine’s version of TCP kicks in. TCP checks that it has all the packets, makes sure that no packets were damaged or corrupted, requests replacement packets (if needed), and then puts the packets in the correct order, passing a perfect copy of your transmission to the program you’re communicating with (an e-mail server, Web server, etc.).
This progression—application at the source to TCP at the source (slice up the data being sent), to IP (for forwarding among routers), to TCP at the destination (put the transmission back together and make sure it’s perfect), to application at the destination—takes place in both directions, starting at the server for messages coming to you, and starting on your computer when you’re sending messages to another computer.
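The slice-up and put-back-together steps can be illustrated with a toy model in Python. It is a sketch, not real TCP: the dictionary fields, the eight-byte packet size, the CRC-32 check, and the placeholder addresses are all assumptions made for illustration, and the real protocol's headers and error handling are more involved.

```python
import random
import zlib

def to_packets(data, source, destination, size=8):
    """Slice data into numbered packets, each with addresses and a checksum."""
    packets = []
    for seq, start in enumerate(range(0, len(data), size)):
        chunk = data[start:start + size]
        packets.append({
            "src": source,
            "dst": destination,
            "seq": seq,                      # position in the original message
            "checksum": zlib.crc32(chunk),   # lets the receiver spot damage
            "payload": chunk,
        })
    return packets

def reassemble(packets, expected_count):
    """Return (data, missing). Damaged or absent packets are listed as missing."""
    good = {p["seq"]: p["payload"] for p in packets
            if zlib.crc32(p["payload"]) == p["checksum"]}
    missing = [seq for seq in range(expected_count) if seq not in good]
    if missing:
        return None, missing  # the receiver would request these again
    return b"".join(good[seq] for seq in range(expected_count)), []

message = b"<html>Hello, Internet!</html>"
# Placeholder addresses from ranges reserved for documentation.
packets = to_packets(message, "192.0.2.10", "198.51.100.7")
random.shuffle(packets)  # packets may arrive out of order
data, missing = reassemble(packets, expected_count=len(packets))
print(data == message, missing)
```

Dropping or corrupting one packet before calling `reassemble` makes it return `None` along with the sequence numbers that need to be sent again.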
UDP: TCP’s Faster, Less Reliable Sibling
TCP is a perfectionist and that’s what you want for Web transmissions, e-mail, and application downloads. But sometimes we’re willing to sacrifice perfection for speed. You’d make this sacrifice for streaming media applications like Windows Media Player, Real Player, Internet voice chat, and video conferencing. Having to wait to make sure each packet is perfectly sent would otherwise lead to awkward pauses that interrupt real-time listening. It’d be better to just grab the packets as they come and play them, even if they have minor errors. Packets are small enough that if one packet doesn’t arrive, you can ignore it and move on to the next without too much quality disruption. A protocol called UDP (user datagram protocol) does exactly this, working as a TCP stand-in when you’ve got the need for speed, and are willing to sacrifice quality. If you’ve ever watched a Web video or had a Web-based phone call and the quality got sketchy, it’s probably because there were packet problems, but UDP kept on chugging, making the “get it fast” instead of “get it perfect” trade-off.
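The "get it fast" trade-off can be sketched as a player that uses whatever arrives and never waits. This is a simplified model, not a real UDP program: the frame numbers are assumed to be added by the application, and the arrival list is invented for illustration.

```python
def play_as_received(arrivals):
    """Play frames in arrival order; discard any that show up too late."""
    played, discarded = [], []
    latest = -1
    for seq, frame in arrivals:
        if seq > latest:
            played.append(frame)   # play immediately, even if earlier frames are missing
            latest = seq
        else:
            discarded.append(seq)  # arrived after its moment had passed
    return played, discarded

# Hypothetical arrivals: frame 2 shows up late, and frame 5 never arrives at all.
arrivals = [(0, "f0"), (1, "f1"), (3, "f3"), (2, "f2"), (4, "f4"), (6, "f6")]
played, discarded = play_as_received(arrivals)
print(played)     # frames 2 and 5 are simply absent from playback
print(discarded)  # the late frame is dropped rather than replayed
```

A TCP-style receiver would instead pause at frame 2 and again at frame 5 until replacements arrived, which is the awkward pause the text describes.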
VoIP: When Phone Calls Are Just Another Internet Application
The increasing speed and reliability of the Internet means that applications such as Internet phone calls (referred to as VoIP, or voice over Internet protocol) are becoming more reliable. That doesn’t just mean that Skype becomes a more viable alternative for consumer landline and mobile phone calls; it’s also good news for many businesses, governments, and nonprofits.
Many large organizations maintain two networks—one for data and another for POTS (plain old telephone service). Maintaining two networks is expensive, and while conventional phone calls are usually of a higher quality than their Internet counterparts, POTS equipment is also inefficient. Old phone systems use a technology called circuit switching. A “circuit” is a dedicated connection between two entities. When you have a POTS phone call, a circuit is open, dedicating a specific amount of capacity between you and the party on the other end. You’re using that “circuit” regardless of whether you’re talking. Pause between words or put someone on hold, and the circuit is still in use. Anyone who has ever tried to make a phone call at a busy time (say, early morning on Mother’s Day or at midnight on New Year’s Eve) and received an “all circuits are busy” recording has experienced congestion on an inefficient circuit-switched phone network.
But unlike circuit-switched counterparts, Internet networks are packet-switched networks, which can be more efficient. Since we can slice conversations up into packets, we can squeeze them into smaller spaces. If there are pauses in a conversation or someone’s on hold, applications don’t hold up the network. And that creates an opportunity to use the network’s available capacity for other users. The trade-off is one that swaps circuit switching’s quality of service (QoS) with packet switching’s efficiency and cost savings. Try to have a VoIP call when there’s too much traffic on a portion of the network and your call quality will drop. But packet switching quality is getting much better. Networking standards are now offering special features, such as “packet prioritization,” that can allow voice packets to gain delivery priority over packets for applications like e-mail, where a slight delay is OK.
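Packet prioritization can be pictured as a queue that always sends the most urgent waiting packet first. The sketch below is one way to model that idea in Python; the traffic classes, their rankings, and the `PriorityLink` name are hypothetical and do not describe any particular networking standard.

```python
import heapq
from itertools import count

# Hypothetical traffic classes; a lower number is sent sooner.
PRIORITY = {"voice": 0, "web": 1, "email": 2}

class PriorityLink:
    """A congested link that transmits waiting packets by priority."""

    def __init__(self):
        self._queue = []
        self._order = count()  # tie-breaker keeps same-class packets in arrival order

    def enqueue(self, kind, payload):
        heapq.heappush(self._queue, (PRIORITY[kind], next(self._order), kind, payload))

    def transmit(self):
        _, _, kind, payload = heapq.heappop(self._queue)
        return kind, payload

    def __len__(self):
        return len(self._queue)

link = PriorityLink()
for kind, payload in [("email", "e1"), ("voice", "v1"), ("email", "e2"), ("voice", "v2")]:
    link.enqueue(kind, payload)

while len(link):
    print(link.transmit())  # voice packets leave before the e-mail that arrived earlier
```

The e-mail still gets through, just slightly later, which is acceptable for an application where a short delay goes unnoticed.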
When voice is digitized, “telephone service” simply becomes another application that sits on top of the Internet, like the Web, e-mail, or FTP. VoIP calls between remote offices can save long distance charges. And when the phone system becomes a computer application, you can do a lot more. Well-implemented VoIP systems allow users’ browsers access to their voice mail inbox, one-click video conferencing and call forwarding, point-and-click conference call setup, and other features, but you’ll still have a phone number, just like with POTS.
What Connects the Routers and Computers?
Routers are connected together, either via cables or wirelessly. A cable connecting a computer in a home or office is probably copper (likely what’s usually called an Ethernet cable), with transmissions sent through the copper via electricity. Long-haul cables, those that carry lots of data over long distances, are usually fiber-optic lines: glass-lined cables that transmit light (light is faster and travels farther than electricity, but fiber-optic networking equipment is more expensive than the copper-electricity kind). Wireless transmission can happen via Wi-Fi (for shorter distances), or cell phone tower or satellite over longer distances. But the beauty of the Internet protocol suite (TCP/IP) is that it doesn’t matter what the actual transmission media are. As long as your routing equipment can connect any two networks, and as long as that equipment “speaks” IP, then you can be part of the Internet.
In reality, your messages likely transfer via lots of different transmission media to get to their final destination. If you use a laptop connected via Wi-Fi, then that wireless connection finds a base station, usually within about three hundred feet. That base station is probably connected to a local area network (LAN) via a copper cable. And your firm or college may connect to fast, long-haul portions of the Internet via fiber-optic cables provided by that firm’s Internet service provider (ISP).
Most big organizations have multiple ISPs for redundancy, providing multiple paths in and out of a network. This is so that if a network connection provided by one firm goes down, say an errant backhoe cuts a cable, other connections can route around the problem (see Figure 12.1).
In the United States (and in most deregulated telecommunications markets), Internet service providers come in all sizes, from smaller regional players to sprawling international firms. When different ISPs connect their networking equipment together to share traffic, it’s called peering. Peering usually takes place at neutral sites called Internet exchange points (IXPs), although some firms also have private peering points. Carriers usually don’t charge one another for peering. Instead, “the money is made” in the ISP business by charging the end-points in a network—the customer organizations and end users that an ISP connects to the Internet. Competition among carriers helps keep prices down, quality high, and innovation moving forward.
Finance Has a Need for Speed
When many folks think of Wall Street trading, they think of the open outcry pit at the New York Stock Exchange (NYSE). But human traders are just too slow for many of the most active trading firms. Over half of all U.S. stock trades and a quarter of worldwide currency trades now happen via programs that make trading decisions without any human intervention (Timmons, 2006). There are many names for this automated, data-driven frontier of finance—algorithmic trading, black-box trading, or high-frequency trading. And while firms specializing in automated, high-frequency trading represent only about 2 percent of the trading firms operating in the United States, they account for about three quarters of all U.S. equity trading volume (Iati, 2009).
Programmers lie at the heart of modern finance. “A geek who writes code—those guys are now the valuable guys” says the former head of markets systems at Fidelity Investments, and that rare breed of top programmer can make “tens of millions of dollars” developing these systems (Berenson, 2009). Such systems leverage data mining and other model-building techniques to crunch massive volumes of data and discover exploitable market patterns. Models are then run against real-time data and executed the instant a trading opportunity is detected. (For more details on how data is gathered and models are built, see Chapter 11 “The Data Asset: Databases, Business Intelligence, and Competitive Advantage”.)
Winning with these systems means being quick—very quick. Suffer delay (what techies call latency) and you may have missed your opportunity to pounce on a signal or market imperfection. To cut latency, many trading firms are moving their servers out of their own data centers and into colocation facilities. These facilities act as storage places where a firm’s servers get superfast connections as close to the action as possible. And by renting space in a “colo,” a firm gets someone else to manage the electrical and cooling issues, often providing more robust power backup and lower energy costs than a firm might get on its own.
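A rough calculation shows why physical distance matters for latency. The sketch below assumes signals travel through fiber at about 200 kilometers per millisecond (roughly two-thirds of the speed of light in a vacuum) and ignores the time routers and servers spend handling packets; the distances are hypothetical examples, not measurements of any real facility.

```python
# Assumption: signal speed in optical fiber, in kilometers per millisecond.
SPEED_IN_FIBER_KM_PER_MS = 200.0

def one_way_delay_ms(distance_km):
    """Best-case travel time over fiber, ignoring all equipment delays."""
    return distance_km / SPEED_IN_FIBER_KM_PER_MS

# Hypothetical distances between a trading server and an exchange.
for label, km in [("same facility", 0.1),
                  ("across a metro area", 50),
                  ("across an ocean", 5000)]:
    print(f"{label}: {one_way_delay_ms(km):.4f} ms one way")
```

Under these assumptions, a server housed next to the exchange's equipment has a built-in head start over one located a city away, before any software runs at all.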
Equinix, a major publicly traded IXP and colocation firm with facilities worldwide, has added a growing number of high-frequency trading firms to a roster of customers that includes e-commerce, Internet, software, and telecom companies. In northern New Jersey alone (the location of many of the servers where “Wall Street” trading takes place), Equinix hosts some eighteen exchanges and trading platforms as well as the NYSE Secure Financial Transaction Infrastructure (SFTI) access node.
Less than a decade ago, eighty milliseconds was acceptably low latency, but now trading firms are pushing below one millisecond into microseconds (Schmerken, 2009). So it’s pretty clear that understanding how the Internet works, and how to best exploit it, is of fundamental and strategic importance to those in finance. But also recognize that this kind of automated trading comes with risks. Systems that run on their own can move many billions in the blink of an eye, and the actions of one system may cascade, triggering actions by others.
The spring 2010 “Flash Crash” resulted in a nearly 1,000-point freefall in the Dow Jones Industrial Average, then its biggest intraday drop ever. Those black boxes can be mysterious: weeks after the May 6 event, experts were still parsing through trading records, trying to unearth how the flash crash happened (Daimler & Davis, 2010). Regulators and lawmakers recognize they now need to understand technology, telecommunications, and their broader impact on society so that they can create platforms that fuel growth without putting the economy at risk.
Watching the Packet Path via Traceroute
Want to see how packets bounce from router to router as they travel around the Internet? Check out a tool called traceroute. Traceroute repeatedly sends a cluster of three packets starting at the first router connected to a computer, then the next, and so on, building out the path that packets take to their destination.
Traceroute is built into all major desktop operating systems (Windows, Macs, Linux), and several Web sites will run traceroute between locations (traceroute.org and visualroute.visualware.com are great places to explore).
The message below shows a traceroute performed between Irish firm VistaTEC and Boston College. At first, it looks like a bunch of gibberish, but if we look closely, we can decipher what’s going on.
The table above shows ten hops, starting at a domain in vistatec.ie and ending in 18.104.22.168 (the table doesn’t say this, but all IP addresses starting with 136.167 are Boston College addresses). The three groups of numbers at the end of each line show the time (in milliseconds) of three packets sent out to test that hop of our journey. These numbers might be interesting for network administrators trying to diagnose speed issues, but we’ll ignore them and focus on how packets get from point to point.
At the start of each line is the name of the computer or router that is relaying packets for that leg of the journey. Sometimes routers are named, and sometimes they’re just IP addresses. When routers are named, we can tell what network a packet is on by looking at the domain name. By looking at the router names to the left of each line in the traceroute above, we see that the first two hops are within the vistatec.ie network. Hop 3 shows the first router outside the vistatec.ie network. It’s at a domain named tinet.net, so this must be the name of VistaTEC’s Internet service provider since it’s the first connection outside the vistatec.ie network.
Sometimes router names suggest their locations (oftentimes they use the same three-character abbreviations you’d see in airports). Look closely at the hosts in hops 3 through 7. The subdomains dub20, lon11, lon01, jfk02, and bos01 suggest the packets are going from Dublin, then east to London, then west to New York City (John F. Kennedy International Airport), then north to Boston. That’s a long way to travel in a fraction of a second!
Hop 4 is at tinet.net, but hop 5 is at cogentco.com (look them up online and you’ll find out that cogentco.com, like tinet.net, is also an ISP). That suggests that between those hops peering is taking place and traffic is handed off from carrier to carrier.
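The same detective work of reading router names and spotting where the network changes can be sketched in Python. The hostnames below are invented placeholders, not the routers from the trace discussed here, and treating the last two parts of a name as "the network" is a simplifying assumption that fails for unnamed routers and for some country-code domains.

```python
def network_of(hostname):
    """Use the last two labels of a router name as a rough network identifier."""
    return ".".join(hostname.split(".")[-2:])

def find_handoffs(hops):
    """Return (hop_number, from_network, to_network) wherever the network changes."""
    handoffs = []
    for index in range(1, len(hops)):
        before = network_of(hops[index - 1])
        after = network_of(hops[index])
        if before != after:
            handoffs.append((index + 1, before, after))  # hop numbers start at 1
    return handoffs

# Hypothetical router names, listed in hop order.
hops = [
    "gw1.office.example",
    "gw2.office.example",
    "edge.dub.isp-one.example",
    "core.lon.isp-one.example",
    "core.lon.isp-two.example",
    "core.jfk.isp-two.example",
]

for hop_number, before, after in find_handoffs(hops):
    print(f"hop {hop_number}: {before} -> {after}")
```

In this made-up trace, the first change marks the step from the organization's own network to its ISP, and the second marks a hand-off between two carriers, the kind of spot where peering would be taking place.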
Hop 8 is still cogentco.com, but it’s not clear who the unnamed router in hop 9, 22.214.171.124, belongs to. We can use the Internet to sleuth that out, too. Search the Internet for the phrase “IP address lookup” and you’ll find a bunch of tools to track down the organization that “owns” an IP address. Using the tool at whatismyip.com, I found that this number is registered to PSI Net, which is now part of cogentco.com.
Routing paths, ISPs, and peering all revealed via traceroute. You’ve just performed a sort of network “CAT scan” and looked into the veins and arteries that make up a portion of the Internet. Pretty cool!
If you try out traceroute on your own, be aware that not all routers and networks are traceroute friendly. It’s possible that as your trace hits some hops along the way (particularly at the start or end of your journey), three “*” characters will show up at the end of each line instead of the numbers indicating packet speed. This indicates that traceroute has timed out on that hop. Some networks block traceroute because hackers have used the tool to probe a network to figure out how to attack an organization. Most of the time, though, the hops between the source and destination of the traceroute (the steps involving all the ISPs and their routers) are visible.
Traceroute can be a neat way to explore how the Internet works and reinforce the topics we’ve just learned. Search for traceroute tools online or browse the Internet for details on how to use the traceroute command built into your computer.
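For readers who want to script a trace, here is a minimal Python sketch. It assumes the built-in tool is named `tracert` on Windows and `traceroute` on Macs and Linux, and it returns nothing if the tool is not installed; the helper names `traceroute_command` and `run_trace` are invented for this example.

```python
import platform
import shutil
import subprocess

def traceroute_command(destination, system=None):
    """Build the command line: Windows names the tool tracert, others traceroute."""
    system = system or platform.system()
    tool = "tracert" if system == "Windows" else "traceroute"
    return [tool, destination]

def run_trace(destination):
    """Run the trace and return its text output, or None if the tool is missing."""
    command = traceroute_command(destination)
    if shutil.which(command[0]) is None:
        return None  # the tool is not available on this machine
    result = subprocess.run(command, capture_output=True, text=True)
    return result.stdout

# Example (needs network access and may take a while):
# print(run_trace("example.com"))
```

The output format differs between operating systems, so any further processing of the text would need to be adapted to what your own machine prints.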
There’s Another Internet?
If you’re a student at a large research university, there’s a good chance that your school is part of Internet2. Internet2 is a research network created by a consortium of research, academic, industry, and government firms. These organizations have collectively set up a high-performance network running at speeds of up to one hundred gigabits per second to support and experiment with demanding applications. Examples include high-quality video conferencing; high-reliability, high-bandwidth imaging for the medical field; and applications that share huge data sets among researchers.
If your university is an Internet2 member and you’re communicating with another computer that’s part of the Internet2 consortium, then your organization’s routers are smart enough to route traffic through the superfast Internet2 backbone. If that’s the case, you’re likely already using Internet2 without even knowing it!
Key Takeaways
- TCP/IP, or the Internet protocol suite, helps get perfect copies of Internet transmissions from one location to another. TCP works on the ends of transmission, breaking transmissions up into manageable packets at the start and putting them back together while checking quality at the end. IP works in the middle, routing packets to their destination.
- Routers are special computing devices that forward packets from one location to the next. Routers are typically connected with more than one outbound path, so in case one path becomes unavailable, an alternate path can be used.
- UDP is a replacement for TCP, used when it makes sense to sacrifice packet quality for delivery speed. It’s often used for media streaming.
- TCP/IP doesn’t care about the transmission media. This allows networks of different types (copper, fiber, and wireless) to connect to and participate in the Internet.
- The ability to swap in new applications, protocols, and media files gives the network tremendous flexibility.
- Decentralization, fault tolerance, and redundancy help keep the network open and reliable.
- VoIP allows voice and phone systems to become an application traveling over the Internet. This is allowing many firms to save money on phone calls and through the elimination of old, inefficient circuit-switched networks. As Internet applications, VoIP phone systems can also have additional features that circuit-switched networks lack. The primary limitation of many VoIP systems is quality of service.
- Many firms in the finance industry have developed automated trading models that analyze data and execute trades without human intervention. Speeds substantially less than one second may be vital to capitalizing on market opportunities, so firms are increasingly moving equipment into colocation facilities that provide high-speed connectivity to other trading systems.
Questions and Exercises
- How can the Internet consist of networks of such physically different transmission media—cable, fiber, and wireless?
- What is the difference between TCP and UDP? Why would you use one over the other?
- Would you recommend a VoIP phone system to your firm or university? Why or why not? What are the advantages? What are the disadvantages? Can you think of possible concerns or benefits not mentioned in this section? Research these concerns online and share your findings with your instructor.
- What are the risks in the kinds of automated trading systems described in this section? Conduct research and find an example of where these systems have caused problems for firms and/or the broader market. What can be done to prevent such problems? Whose responsibility is this?
- Search the Internet for a traceroute tool, or look online to figure out how to use the traceroute command built into your PC. Run three or more traceroutes to different firms at different locations around the world. List the number of ISPs that show up in the trace. Circle the areas where peering occurs. Do some of the “hops” time out with “*” values returned? If so, why do you think that happened?
- Find out if your school or employer is an Internet2 member. If it is, run traceroutes to schools that are and are not members of Internet2. What differences do you see in the results?