Writing network programs under Linux

Linux is a network operating system. Even some system services run as network servers, providing access to outside machines. Without going unnecessarily into the theory of ports, protocols and other network details, I'll show you how to write a simple server and a client for it.

Network communication involves many different elements. The basic concept is the socket. A socket is a logical device (it does not exist physically), the basic gate through which information flows. To create a socket, use the socket function (from the C library, like all the functions that follow). It takes 3 parameters (see: man 2 socket):

  1. domain - specifies the connection type. We're going to use the value PF_INET=2, which stands for the internet protocols (IPv4)
  2. type of socket - tells if the socket uses datagrams, streams or is a raw socket. We're going to use a stream socket SOCK_STREAM=1 and the TCP (Transmission Control Protocol) protocol, which guarantees a reliable, two-way connection.
  3. the protocol, if it is ambiguous. Here, TCP is determined by the socket type (stream socket), so this parameter has the value 0.

If creating the socket fails, the socket function returns -1. If it succeeds, the function returns a non-negative integer - a descriptor of an open socket (just like with files). When the transmission is finished, the socket can be closed using the close function.
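
As a minimal sketch of the call described above, here is the same step in C, since socket and close are C library functions. The helper names open_tcp_socket and close_socket are made up for this illustration.

```c
#include <sys/socket.h>   /* socket, AF_INET, SOCK_STREAM */
#include <unistd.h>       /* close */

/* Hypothetical helper: returns a TCP/IPv4 socket descriptor, or -1 on failure. */
int open_tcp_socket(void)
{
    /* domain = AF_INET, type = SOCK_STREAM, protocol = 0 (TCP is implied) */
    return socket(AF_INET, SOCK_STREAM, 0);
}

/* Hypothetical helper: closes the descriptor; returns 0 on success, -1 on error. */
int close_socket(int fd)
{
    return close(fd);
}
```

The descriptor returned by open_tcp_socket is what every later call (bind, connect, send and so on) takes as its first parameter.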

After creating the socket, the code for the server and the client differs, so the two need to be described separately.



The server code



As we know, a server's job is to listen for connections from clients. To achieve that, the following steps have to be taken.

  1. Assign the socket to an address.

    After creation, a socket is not assigned to any address, but we need to specify which address and port we're going to listen on. To do that, use the bind function. It takes the following parameters (see: man 2 bind):

    1. a socket created with the socket function
    2. address of a sockaddr structure, described below
    3. length of the structure in parameter 2

    Although the declaration of the bind function mentions a sockaddr structure, it is common to pass a properly cast address of a sockaddr_in structure instead. This structure looks like this:


    	  struc sockaddr_in
    		.sin_family resw 1	; address family
    		.sin_port   resw 1	; port number
    		.sin_addr   resd 1	; listen address
    		            resb 8	; fill to 16 bytes
    	  endstruc

    Put AF_INET=2 into the address family field, meaning the internet protocols family.

    The sin_port field should contain the port number our server will listen on. But be warned - do not put the port number directly in this field. First, the port number must be translated to network byte order using the htons function (see: man htons). This function accepts one parameter - the port number. Fill sin_port with the result of this function. Programs running without root privileges can only use ports numbered higher than 1023.

    Put the value INADDR_ANY=0 into the sin_addr field, which means that the server will listen on any address.

    In case of error, bind returns -1.

  2. Enable the server to listen on the given socket.

    To turn listening on for a socket, use the listen function. It takes two parameters (see: man 2 listen):

    1. a socket created with the socket function with an address assigned to it using bind
    2. maximum number of clients waiting in a queue for service

    In case of error, listen returns -1.

    If the listen succeeds, you can put the server in daemon mode (described in the tutorial about writing resident programs).
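
The two steps above can be sketched in C roughly as follows. The helper name make_listening_socket is hypothetical, and the sketch assumes you want to listen on every local address. Passing port 0 is an assumption made here so that the system picks a free port; the server in this tutorial uses a fixed port instead.

```c
#include <string.h>       /* memset */
#include <sys/socket.h>   /* socket, bind, listen */
#include <netinet/in.h>   /* struct sockaddr_in, INADDR_ANY, htons, htonl */
#include <unistd.h>       /* close */

/* Hypothetical helper: returns a listening socket, or -1 on any failure.
 * port is given in host byte order; backlog is the queue length for listen. */
int make_listening_socket(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);             /* also clears the 8 filler bytes */
    addr.sin_family = AF_INET;                 /* internet address family */
    addr.sin_port = htons(port);               /* never store the raw port number */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* listen on any address */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Note the cast of the sockaddr_in address to a sockaddr pointer, which is the "properly cast address" mentioned in step 1.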

After enabling listening on a port, the server can accept connections. To accept a connection, use the accept function. It takes three parameters (see: man 2 accept):

  1. a socket in listen mode
  2. a zero or an address of a sockaddr structure (or the same sockaddr_in structure used for the bind function). This structure will be filled with client information (like its IP address)
  3. address of a variable containing the length of the structure given in the second parameter

When the client has connected, accept returns a descriptor for a new socket which can be used to communicate with that client.
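
A rough C sketch of accepting one connection is shown below. The helper name accept_one is made up for this illustration. The point to notice is that the third parameter is the address of a length variable, not the length itself.

```c
#include <stddef.h>       /* NULL */
#include <sys/socket.h>   /* accept, socklen_t */
#include <netinet/in.h>   /* struct sockaddr_in */

/* Hypothetical helper: waits for one client on a listening socket.
 * Returns the new per-client descriptor, or -1 on error.
 * If client is not NULL, it receives the client's address and port. */
int accept_one(int listen_fd, struct sockaddr_in *client)
{
    socklen_t len = sizeof *client;   /* in: buffer size, out: bytes filled in */

    if (client == NULL)
        return accept(listen_fd, NULL, NULL);   /* client data not wanted */
    return accept(listen_fd, (struct sockaddr *)client, &len);
}
```

The listening socket stays open after accept returns, so the same call can be repeated to serve further clients.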



The client code



Less work is required to create the client compared to the server. After creating the socket, just one function is needed to connect to the server - connect. It takes three parameters (see: man 2 connect):

  1. a socket created with the socket function
  2. address of a sockaddr structure
  3. length of the structure

Instead of passing the address of a sockaddr structure, we pass the address of a sockaddr_in structure here, too. But it needs to be filled in a different way.

The sin_family and sin_port fields are filled in just as for bind: we want to connect to the port the server is listening on.

The sin_addr field must be filled with the address of the server. Not directly with a string of characters, of course, but in a processed form. To convert a string containing 127.0.0.1 (the loopback address, which always refers to the current machine) to the right form, use the inet_aton function. It takes two parameters (see: man inet_aton):

  1. address of a character string containing the address in decimal dotted notation (ttt.xxx.yyy.zzz)
  2. address of an in_addr structure, which will get the result

The sin_addr field of our sockaddr_in structure is itself an in_addr structure, so the address of this field should be passed to inet_aton.
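
As a small C sketch of this conversion (the helper name set_ipv4_address is made up for this illustration):

```c
#include <arpa/inet.h>    /* inet_aton */
#include <netinet/in.h>   /* struct sockaddr_in */

/* Hypothetical helper: stores a dotted-decimal IPv4 string in sa->sin_addr.
 * Returns 1 on success and 0 if the string is not a valid address. */
int set_ipv4_address(struct sockaddr_in *sa, const char *dotted)
{
    /* inet_aton returns nonzero for a valid address and 0 for an invalid one */
    return inet_aton(dotted, &sa->sin_addr) != 0;
}
```

After a successful call for "127.0.0.1", the four bytes of sin_addr in memory are 127, 0, 0, 1, that is, already in network byte order.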

After successfully establishing a connection with the connect function, you can start exchanging data.
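
Put together, the client side might look like this in C. This is only a sketch; the helper name connect_to is hypothetical.

```c
#include <string.h>       /* memset */
#include <sys/socket.h>   /* socket, connect */
#include <netinet/in.h>   /* struct sockaddr_in, htons */
#include <arpa/inet.h>    /* inet_aton */
#include <unistd.h>       /* close */

/* Hypothetical helper: connects to ip:port (port in host byte order).
 * Returns a connected socket descriptor, or -1 on failure. */
int connect_to(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    int fd;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;              /* filled just like for bind */
    addr.sin_port = htons(port);
    if (inet_aton(ip, &addr.sin_addr) == 0)
        return -1;                          /* malformed address string */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);                          /* nobody listening, or other error */
        return -1;
    }
    return fd;
}
```

For the server from this tutorial, the call would be connect_to("127.0.0.1", 4242).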



Data exchange



After connecting, both sides - the client and the server - have open sockets they can use to communicate. To exchange data, use the two basic functions: send and recv. Both take exactly the same four parameters (see: man 2 send, man 2 recv):

  1. a socket connected to the client/server
  2. address of a transmission/receive buffer
  3. length of the buffer
  4. special flags, if needed. Zero will do for us.
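
A short C sketch of both calls follows. The helper names send_all and recv_some are made up for this illustration. It also shows two details worth knowing: send may transmit fewer bytes than requested, and recv returns 0 when the other side has closed the connection.

```c
#include <sys/types.h>    /* ssize_t, size_t */
#include <sys/socket.h>   /* send, recv */

/* Hypothetical helper: repeats send until all len bytes have been handed
 * to the system. Returns 0 on success, -1 on error. */
int send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = send(fd, buf, len, 0);   /* flags = 0 */
        if (n < 0)
            return -1;
        buf += n;                            /* skip what was already sent */
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: a single recv call. Returns the number of bytes
 * received, 0 if the peer closed the connection, or -1 on error. */
ssize_t recv_some(int fd, char *buf, size_t cap)
{
    return recv(fd, buf, cap, 0);            /* flags = 0 */
}
```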


Example

After going through the hard theory we can start writing programs. I realize that theory on its own won't let anyone write a client and server program immediately (there are a few traps you must watch out for), so I'll give an example server and client here (NASM syntax).

The server:


; Server program
;
; author: Bogdan D., bogdandr (at) op.pl
;
; assembly:
; nasm -O999 -f elf -o server.o server.asm
; gcc -o server server.o

section	.text
global	main		; we're going to use the C library, so our main
			; function must be called "main"

; definitions of a few useful constants
%define PF_INET		2
%define AF_INET		PF_INET
%define SOCK_STREAM	1
%define INADDR_ANY	0

%define	NPORTU		4242
%define	MAXKLIENT	5	; maximum number of clients

; external C library functions we're going to use
extern	daemon
extern	socket
extern	listen
extern	accept
extern	bind
extern	htons
extern	recv
extern	send
extern	close

main:
	push	dword 0
	push	dword SOCK_STREAM
	push	dword AF_INET
	call	socket			; create a socket:
					;socket(AF_INET,SOCK_STREAM,0);
	add	esp, 12			; remove parameters from the stack

	cmp	eax, 0			; EAX < 0 means error
	jl	.sock_err

	mov	[gniazdo], eax		; save the socket descriptor

	push	dword NPORTU		; C arguments occupy a full dword
	call	htons			; convert port number to correct form
					; htons(NPORTU);
	add	esp, 4

			; fill the converted port number:
	mov	[adres+sockaddr_in.sin_port], ax
			; internet address family:
	mov	word [adres+sockaddr_in.sin_family], AF_INET
			; accept any address
	mov	dword [adres+sockaddr_in.sin_addr], INADDR_ANY

	push	dword sockaddr_in_size
	push	dword adres
	push	dword [gniazdo]
	call	bind			; assign the socket to an address:
				; bind(gniazdo,&adres,sizeof(adres));
	add	esp, 12

	cmp	eax, 0
	jl	.bind_err

	push	dword MAXKLIENT
	push	dword [gniazdo]
	call	listen			; enable listening:
					; listen(gniazdo,MAXKLIENT);
	add	esp, 8

	cmp	eax, 0
	jl	.list_err

	push	dword 1
	push	dword 1
	call	daemon			; go to daemon mode
	add	esp, 8			; remove parameters from the stack

	mov	dword [rozmiar], sockaddr_in_size

.czekaj:
	push	dword rozmiar		; [rozmiar] contains the size of the
					; sockaddr_in structure
	push	dword adres
	push	dword [gniazdo]
	call	accept			; wait for connections
				; accept(gniazdo,&adres,&rozmiar)
	add	esp, 12
	cmp	eax, 0
	jl	.czekaj

	mov	[gniazdo_kli], eax	; if accept succeeds, it will return
					; a new client socket

.rozmowa:
	push	dword 0
	push	dword buf_d
	push	dword bufor
	push	dword [gniazdo_kli]
	call	recv			; receive data;
			; recv(gniazdo_kli,&bufor,sizeof(bufor),0);
	add	esp, 16

	cmp	eax, 0			; if error, wait again
	jl	.rozmowa

	cmp	byte [bufor], "Q"	; send a 'Q' to finish the connection
	je	.koniec

	mov	ecx, buf_d
	mov	edi, bufor
	xor	eax, eax
	cld
	rep	stosb			; empty the buffer

	push	dword 0
	push	dword 2
	push	dword ok
	push	dword [gniazdo_kli]
	call	send			; send the data
					; (respond with "OK" to anything)
					; send(gniazdo_kli,&ok,2,0);
	add	esp, 16

	jmp	.rozmowa		; and wait again

.koniec:
	push	dword 0
	push	dword buf_d
	push	dword bufor
	push	dword [gniazdo_kli]
	call	send			; send the 'Q', which is in the buffer
	add	esp, 16

	push	dword [gniazdo_kli]
	call	close			; close the client socket
	add	esp, 4

; if we want our server to listen for more connections, write:
;;; 	jmp	.czekaj
; the only way to stop a server will be to kill its process

	push	dword [gniazdo]
	call	close			; close the main server socket
	add	esp, 4

	mov	eax, 1
	xor	ebx, ebx
	int	80h			; exit the program


; error handling:

.sock_err:
	mov	eax, 4
	mov	ebx, 1
	mov	ecx, err_socket
	mov	edx, err_socket_d
	int	80h			; print a string

	mov	eax, 1
	mov	ebx, 1
	int	80h			; exit the program with the correct
					; error code

.bind_err:
	mov	eax, 4
	mov	ebx, 1
	mov	ecx, err_bind
	mov	edx, err_bind_d
	int	80h

	push	dword [gniazdo]
	call	close			; close the socket

	mov	eax, 1
	mov	ebx, 2
	int	80h

.list_err:
	mov	eax, 4
	mov	ebx, 1
	mov	ecx, err_listen
	mov	edx, err_listen_d
	int	80h

	push	dword [gniazdo]
	call	close			; close the socket

	mov	eax, 1
	mov	ebx, 3
	int	80h


section .data

					; socket descriptors:
gniazdo		dd	0
gniazdo_kli	dd	0

bufor		times	20	db	0	; send/receive buffer
buf_d		equ	$ - bufor		; length of the buffer

						; error messages:
err_socket	db	"Problem with socket!", 10
err_socket_d	equ	$ - err_socket

err_bind	db	"Problem with bind!", 10
err_bind_d	equ	$ - err_bind

err_listen	db	"Problem with listen!", 10
err_listen_d	equ	$ - err_listen

ok		db	"OK"		; what we send

struc sockaddr_in

	.sin_family	resw	1		; address family
	.sin_port	resw	1		; port number
	.sin_addr	resd	1		; address

			resb	8		; fill to 16 bytes
endstruc

adres		istruc	sockaddr_in		; address as a variable, which
						; is a structure
		iend				; reserve the whole structure
rozmiar		dd	sockaddr_in_size	; size of the structure

The client:


; Client program
;
; author: Bogdan D., bogdandr (at) op.pl
;
; assembly:
; nasm -O999 -f elf -o client.o client.asm
; gcc -o client client.o

section	.text
global	main		; we're going to use the C library, so our main
			; function must be called "main"

; definitions of a few useful constants
%define PF_INET		2
%define AF_INET		PF_INET
%define SOCK_STREAM	1
%define INADDR_ANY	0

%define	NPORTU		4242

; external C library functions we're going to use
extern	socket
extern	connect
extern	htons
extern	recv
extern	send
extern	close
extern	inet_aton

main:
	push	dword 0
	push	dword SOCK_STREAM
	push	dword AF_INET
	call	socket			; create a socket:
					; socket(AF_INET,SOCK_STREAM,0);
	add	esp, 12			; remove parameters from the stack

	cmp	eax, 0			; EAX < 0 means error
	jl	.sock_err

	mov	[gniazdo], eax		; save the socket descriptor

					; internet address family:
	mov	word [adres+sockaddr_in.sin_family], AF_INET

	push	dword (adres + sockaddr_in.sin_addr)
	push	dword localhost
	call	inet_aton		; convert 127.0.0.1 to the correct
					; form
	add	esp, 8
	test	eax, eax		; EAX = 0 means the address
					; was incorrect
	jz	.inet_err

	push	dword NPORTU		; C arguments occupy a full dword
	call	htons			; convert the port number to the
					; correct form
	add	esp, 4
				; put the converted port number:
	mov	word [adres+sockaddr_in.sin_port], ax

	push	dword sockaddr_in_size
	push	dword adres
	push	dword [gniazdo]
	call	connect			; connect to the server:
				; connect(gniazdo,&adres,sizeof(adres));
	add	esp, 12

	cmp	eax, 0
	jne	.conn_err

.rozmowa:
	mov	eax, 3
	mov	ebx, 0
	mov	ecx, bufor
	mov	edx, buf_d
	int	80h			; read input from the standard input

	push	dword 0
	push	dword buf_d
	push	dword bufor
	push	dword [gniazdo]
	call	send			; send what we've read:
				; send(gniazdo,&bufor,sizeof(bufor),0);
	add	esp, 16

	cmp	eax, 0
	jl	.send_err

	mov	ecx, buf_d
	mov	edi, bufor
	xor	eax, eax
	cld
	rep	stosb			; empty the buffer

.odbieraj:
	push	dword 0
	push	dword buf_d
	push	dword bufor
	push	dword [gniazdo]
	call	recv			; receive data from the server:
				; recv(gniazdo,&bufor,sizeof(bufor),0);
	add	esp, 16

	cmp	eax, 0
	jl	.recv_err		; on error, report it and exit

	mov	eax, 4
	mov	ebx, 1
	mov	ecx, odebrano
	mov	edx, odebrano_dl
	int	80h			; print what's received

	cmp	byte [bufor], "Q"		; "Q" ends the connection
	jne	.rozmowa

	push	dword [gniazdo]
	call	close			; close the socket
	add	esp, 4

	mov	eax, 1
	xor	ebx, ebx
	int	80h			; exit the program


; error handling section

.sock_err:
	mov	eax, 4
	mov	ebx, 1
	mov	ecx, err_socket
	mov	edx, err_socket_d
	int	80h			; print a message

	mov	eax, 1
	mov	ebx, 1
	int	80h			; exit the program with a correct
					; error code

.conn_err:
	mov	eax, 4
	mov	ebx, 1
	mov	ecx, err_connect
	mov	edx, err_connect_d
	int	80h

	push	dword [gniazdo]
	call	close			; close the socket

	mov	eax, 1
	mov	ebx, 2
	int	80h

.inet_err:
	mov	eax, 4
	mov	ebx, 1
	mov	ecx, err_inet
	mov	edx, err_inet_d
	int	80h

	push	dword [gniazdo]
	call	close			; close the socket

	mov	eax, 1
	mov	ebx, 3
	int	80h

.send_err:
	mov	eax, 4
	mov	ebx, 1
	mov	ecx, err_send
	mov	edx, err_send_d
	int	80h

	push	dword [gniazdo]
	call	close			; close the socket

	mov	eax, 1
	mov	ebx, 4
	int	80h

.recv_err:
	mov	eax, 4
	mov	ebx, 1
	mov	ecx, err_recv
	mov	edx, err_recv_d
	int	80h

	push	dword [gniazdo]
	call	close			; close the socket

	mov	eax, 1
	mov	ebx, 5
	int	80h

section .data

gniazdo		dd	0			; socket descriptor

odebrano	db	"Server: "
bufor		times	20	db	0	; send/receive buffer
buf_d		equ	$ - bufor		; buffer length
		db	10			; a newline
odebrano_dl	equ	$ - odebrano

						; error messages
err_socket	db	"Problem with socket!", 10
err_socket_d	equ	$ - err_socket

err_connect	db	"Problem with connect!", 10
err_connect_d	equ	$ - err_connect

err_inet	db	"Problem with inet_aton!", 10
err_inet_d	equ	$ - err_inet

err_send	db	"Problem with send!", 10
err_send_d	equ	$ - err_send

err_recv	db	"Problem with recv!", 10
err_recv_d	equ	$ - err_recv


localhost	db	"127.0.0.1", 0		; address we're going
						; to connect to

struc sockaddr_in

	.sin_family	resw	1		; address family
	.sin_port	resw	1		; port number
	.sin_addr	resd	1		; address

			resb	8		; fill to 16 bytes
endstruc

adres		istruc	sockaddr_in		; address as a variable, which
						; is a structure
		iend				; reserve the whole structure

Since these programs use the C library, we need different commands than usual to build them:

	nasm -f elf -o file.o file.asm
	gcc -o file file.o

After compiling, start the server with the command ./server (the server program will put itself into the background). You can check what happens if you try to run the server twice at the same time, or run the client without the server running. Of course, a server can itself be a client of another server (for example, to process the received data and pass it on to something else).



Networking functions of int 80h



Using the network is of course possible without the C library. Ultimately, anything that important must be part of the system kernel.

The Linux kernel's network interface is just one function - sys_socketcall (number 102). It takes two parameters. The first (EBX) is the number of the function we want to run. Each of the C library functions mentioned above has its own number: socket - 1, bind - 2, connect - 3, listen - 4, accept - 5, send - 9, recv - 10. The close function is the same as for closing files (EBX=[gniazdo], EAX=6, int 80h).

The second parameter (ECX) is the address of the remaining parameters, that is, the parameters of the C library function. You can put these on the stack in the same order as in the above programs and perform a mov ecx, esp. This is how the C library does it (see the sysdeps/unix/sysv/linux/i386/socket.S file in glibc sources; you can see "ecx+4" there, because it needs to skip over the return address on the stack). You can also put the data in your data section and pass its address, but the values must occupy consecutive locations in memory, in exactly the order they would have on the stack: the order of the C declaration, from left to right, with growing addresses.
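
The layout of the parameter block can be illustrated with a C sketch that bypasses the socket wrapper. The helper name raw_tcp_socket is made up. The sketch assumes a glibc system, where SYS_socketcall is defined only on platforms that have this system call (such as 32-bit x86); elsewhere it falls back to the separate socket system call.

```c
#define _GNU_SOURCE        /* asks glibc to declare syscall() */
#include <sys/socket.h>    /* AF_INET, SOCK_STREAM */
#include <sys/syscall.h>   /* SYS_socketcall or SYS_socket, where defined */
#include <unistd.h>        /* syscall */

/* Hypothetical helper: creates a TCP socket without the socket() wrapper.
 * Returns the descriptor, or -1 on error. */
int raw_tcp_socket(void)
{
#ifdef SYS_socketcall
    /* Parameter block in C declaration order: domain, type, protocol.
     * The values sit at consecutive, growing addresses. */
    unsigned long args[3] = { AF_INET, SOCK_STREAM, 0 };

    /* 1 is the number of the socket function; args plays the role of ECX. */
    return (int)syscall(SYS_socketcall, 1, args);
#else
    /* Assumption: no socketcall here, so socket is a system call of its own. */
    return (int)syscall(SYS_socket, AF_INET, SOCK_STREAM, 0);
#endif
}
```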

There are two helper functions left to describe - htons and inet_aton.

The htons function is easy (see the sysdeps/i386/htons.S file in glibc sources). Its contents can be put in a macro like this (assuming the parameter is in EAX):

	%macro htons 0
		and	eax, 0FFFFh
		ror	ax, 8
	%endm

This function simply zeroes out the higher half of EAX and exchanges the values of AH and AL.
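
For comparison, the same operation can be written in C roughly like this. The helper name swap_port_bytes is made up. On a little-endian machine such as x86 the result matches htons; on a big-endian machine htons changes nothing, so this sketch applies to x86 only.

```c
#include <stdint.h>   /* uint16_t, uint32_t */

/* Hypothetical helper mirroring the macro: keep the low 16 bits of the
 * value and exchange its two bytes. */
uint16_t swap_port_bytes(uint32_t value)
{
    uint16_t low = (uint16_t)(value & 0xFFFFu);    /* and eax, 0FFFFh */

    return (uint16_t)((low >> 8) | (low << 8));    /* ror ax, 8 */
}
```

For the port used in this tutorial, 4242 is 1092h, so the swapped value is 9210h.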

The inet_aton function (the resolv/inet_addr.c file in glibc sources) is a bit more difficult. I prefer a shortcut: write the address in binary form, for example 127.0.0.1 is 7F000001h and 192.168.0.2 is C0A80002h, and then reverse the byte order. It is best to use the following macro, which produces the reversed form in EAX right away:

	%macro adr2bin 4

		mov	al, %4
		shl	eax, 8
		mov	al, %3
		shl	eax, 8
		mov	al, %2
		shl	eax, 8
		mov	al, %1
	%endm

	; usage:
		adr2bin	127, 0, 0, 1	   ; for 127.0.0.1
		adr2bin	192, 168, 45, 243  ; for 192.168.45.243

The result of this macro (EAX) should be stored in the four bytes of the sin_addr field in the sockaddr_in structure (inet_aton did this automatically).
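
The value built by the macro can be checked with a C sketch of the same shifts. The function below reuses the macro's name purely for illustration.

```c
#include <stdint.h>   /* uint8_t, uint32_t */

/* Sketch of the adr2bin macro: for the address a.b.c.d, returns the 32-bit
 * value that a little-endian (x86) processor stores in memory as the bytes
 * a, b, c, d, which is the order the network expects. */
uint32_t adr2bin(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)d << 24) | ((uint32_t)c << 16) |
           ((uint32_t)b << 8)  |  (uint32_t)a;
}
```

So adr2bin(127, 0, 0, 1) gives 0100007Fh: the first part of the address ends up in the lowest byte.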

All this reversing is needed because network byte order is big-endian, while Intel-compatible processors are little-endian.

How to write daemons using only int 80h is described in the tutorial about writing resident programs.



Networking functions in a 64-bit system



Networking in a 64-bit system is a bit different than in a 32-bit one. There is no longer one shared function number; instead, each network function has its own system call: socket - 41, connect - 42, accept - 43, sendto - 44, recvfrom - 45, bind - 49, listen - 50. The parameters are passed not on the stack, but in registers, according to the 64-bit system call interface (the order of the registers is: RDI, RSI, RDX, R10, R8, R9). The system call itself is made with the syscall processor instruction, not with int 80h.

Example system calls follow:

	mov	rax, 41			; socket
	mov	rdi, AF_INET
	mov	rsi, SOCK_STREAM
	mov	rdx, IPPROTO_TCP
	syscall

	mov	rax, 42			; connect
	mov	rdi, [socket]
	mov	rsi, sock_struc
	mov	rdx, sockaddr_in_size
	syscall

	mov	rax, 44			; sendto
	mov	rdi, [socket]
	mov	rsi, buf
	mov	rdx, buf_len
	mov	r10, 0			; flags
	mov	r8, 0			; no destination address, the socket
	mov	r9, 0			; is already connected
	syscall

	mov	rax, 49			; bind
	mov	rdi, [socket]
	mov	rsi, sock_struc
	mov	rdx, sockaddr_in_size
	syscall

	mov	rax, 50			; listen
	mov	rdi, [socket]
	mov	rsi, MAXCLIENT
	syscall

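	mov	rax, 43			; accept
	mov	rdi, [socket]
	mov	rsi, sock_struc
	mov	rdx, sock_len		; ADDRESS of a dword variable holding
					; sockaddr_in_size (sock_len is a name
					; assumed here, define it yourself)
	syscall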

	mov	rax, 45			; recvfrom
	mov	rdi, [socket_client]
	mov	rsi, buf
	mov	rdx, buf_len
	mov	r10, 0			; flags
	mov	r8, 0			; sender's address is not needed
	mov	r9, 0
	syscall

	...
struc sockaddr_in
	.sin_family:	resw 1
	.sin_port:	resw 1
	.sin_addr:	resd 1
			resb 8
endstruc

sock_struc istruc sockaddr_in
	iend

The htons and inet_aton replacements are the same as for a 32-bit system (the network byte order doesn't change).
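
The 64-bit list has sendto and recvfrom but no send or recv. On a connected socket the longer calls behave like the shorter ones when the two extra address parameters are zero (in the system-call form, zero in R8 and R9). A C sketch with made-up helper names:

```c
#include <stddef.h>       /* NULL, size_t */
#include <sys/types.h>    /* ssize_t */
#include <sys/socket.h>   /* sendto, recvfrom */

/* Hypothetical helper: behaves like send(fd, buf, len, 0). */
ssize_t send_connected(int fd, const void *buf, size_t len)
{
    /* No destination address: the socket is already connected. */
    return sendto(fd, buf, len, 0, NULL, 0);
}

/* Hypothetical helper: behaves like recv(fd, buf, cap, 0). */
ssize_t recv_connected(int fd, void *buf, size_t cap)
{
    /* The sender's address is not requested. */
    return recvfrom(fd, buf, cap, 0, NULL, NULL);
}
```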



Two more things are worth mentioning. The first is the strace and ltrace programs. They let you trace which functions your program calls, and when. If something isn't working, turn off daemon mode in the server, run strace ./server and watch which function calls cause problems. The same can be done with the client, of course, for example on a second terminal. Read the manual pages for details.

The second thing is for those of you who are seriously thinking about writing network applications: the RFC (Request for Comments) documents. They describe all public protocols, like HTTP, SMTP or POP3: rfc-editor.org.

