Gwren's Home Page

Site administered by Gwren. You can find me on IRC as Gwren.

ANts P2P Project Description

ANts P2P implements a third-generation P2P network.
It protects your privacy while you are connected and makes you untraceable,
hiding your identity (IP) and encrypting everything you send to or receive from others.

"I worry about my child and the Internet all the time, even though she's too young to have logged on yet.
Here's what I worry about. I worry that 10 or 15 years from now, she will come to me and say 'Daddy, where
were you when they took freedom of the press away from the Internet?'"
--Mike Godwin, Electronic Frontier Foundation

If you like ANts P2P and want to support this project, make a donation!

Features

Open Source Java implementation (GNU-GPL license).
Multiple-source download.
Swarming download from partial files.
Automatic resume and source search over the network.
Search by hash [ED2K - ANts2K] or string.
Embedded support for heterogeneous data types (not only arrays of bytes...).
Completely object-oriented routing protocol.
Point-to-point secured communication: EDH(512)-AES(128)
Endpoint-to-endpoint secured communication: EDH(512)-AES(128)
Automatic serverless peer discovery through webcaches, IRC or Jabber.
Support for eMule IP block lists (ipfilter.dat)
Embedded IRC chat system.
Full-text search of indexed documents (pdf, html, txt, doc etc) -> QUERY REFERENCE.
Distributed/decentralized search engine based on a supernode architecture (like Gnutella)
HTTP tunneling.

Abstract

The main problems of second-generation P2P networks (as well as first-generation ones) are two: the complete lack of privacy and data protection, and the lack of structured queries. By running queries over the network, anyone can see who is connected and what they are sharing. This is not acceptable if we care about our privacy. Furthermore, we usually have trouble finding what we are looking for: query support in common P2P systems is usually unstructured and based on exact text matching, so we cannot exploit SQL-like features (joins and so on) over the network's data set. This project tries to solve both problems.

The privacy problem

ANts tackles the privacy problem by breaking the essence of P2P: a connection is no longer point-to-point in the strict sense. The peers are virtual peers on a virtual network, so when we request a resource, our request is routed through many nodes until it reaches the target peer. Peers are no longer identified by IP. Each has a unique ID that is also a public key, used to verify the signatures of the messages issued by that node (which is the only one able to sign messages for a given public key, i.e. ID); this prevents spoofing. A client therefore knows only the IPs of its neighbours (the peers directly connected to it), but it does not know their IDs, since only a node itself knows its own ID.

So what about routing? How can a node route a message if it doesn't know where the destination is? Simple: a node knows which is the "best" direction to route a message in, but not precisely where another node is. The routing protocol was developed from studies of ant behaviour: ants do not know the precise location of their nest, they simply follow a track, and the same happens in this system. The more messages follow a track, the "stronger" that track becomes; if a track produces many failures it fades out and is no longer followed.

This way we achieve privacy for our identity, but what about the information sent? It has to be routed through many peers, so how can we protect it? Protection is implemented at two levels: at the low level (against a man-in-the-middle external to our network) by encrypting the communication between each pair of directly linked nodes, and at the high level (against internal threats) by encrypting the communication between the two endpoints. At both levels security is provided by Diffie-Hellman key agreement (DH-KA) and DES or AES (negotiated at the beginning).
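To make the track-following idea more concrete, here is a minimal Python sketch of a per-node routing table. It is only an illustration, not the actual ANts code (which is written in Java): the class name PheromoneTable, the parameters reinforce, decay and floor, and the weighted random choice are all assumptions made for the example. Each node keeps, for every destination ID, a strength value per neighbour; successes strengthen a track and failures make it fade.

```python
import random


class PheromoneTable:
    """Hypothetical per-node table: destination ID -> {neighbour: track strength}."""

    def __init__(self, neighbours, reinforce=1.0, decay=0.5, floor=0.01):
        self.neighbours = list(neighbours)
        self.reinforce = reinforce  # amount added when a track succeeds (assumed value)
        self.decay = decay          # factor applied when a track fails (assumed value)
        self.floor = floor          # keeps weights positive so choices() stays valid
        self.tracks = {}

    def _row(self, dest_id):
        # A destination seen for the first time starts with equal tracks everywhere.
        return self.tracks.setdefault(dest_id, {n: 1.0 for n in self.neighbours})

    def choose(self, dest_id, rng=random):
        """Pick the next hop at random, weighted by track strength."""
        row = self._row(dest_id)
        names = list(row)
        weights = [row[n] for n in names]
        return rng.choices(names, weights=weights, k=1)[0]

    def success(self, dest_id, neighbour):
        """A message sent through this neighbour got through: strengthen the track."""
        self._row(dest_id)[neighbour] += self.reinforce

    def failure(self, dest_id, neighbour):
        """A message sent through this neighbour failed: let the track fade."""
        row = self._row(dest_id)
        row[neighbour] = max(self.floor, row[neighbour] * self.decay)
```

Note that the node never needs to know where the destination is; it only remembers which neighbour has worked well for a given ID. In this sketch a faded track is still tried very rarely because of the floor value, which is a simplification chosen here so that a recovered path can be rediscovered.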

The queries problem

A different solution has been found for the query problem. Each query is distributed (in a non-deterministic, sequential way) over part of the network. It is processed by each node it passes through and finally returned to the source along the shortest path. Each node can perform operations more complex than simple text matching: we can support pseudo high-level SQL queries over the data set represented by the part of the network we have explored. To support wide-range queries, a two-layer system has been developed, very similar to the Gnutella supernode overlay network. The ANts network uses anonymous supernodes (with high bandwidth), elected by a bidding procedure, to let the whole network scale freely, so that every node can access every file on the network (partial or complete).
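As a rough illustration of a query that is evaluated by every node it passes, the following Python sketch walks a small network and applies a structured predicate at each hop. It is a simplification under stated assumptions, not the ANts protocol: the function run_query, the ttl and fanout parameters, and the dictionary-based file records are all made up for the example, and supernodes and the return path are left out.

```python
import random


def run_query(network, shares, start, predicate, ttl=3, fanout=1, rng=random):
    """Spread a query from `start` and collect the records matching `predicate`.

    network: node -> list of neighbours (hypothetical adjacency map)
    shares:  node -> list of file records (dicts) shared by that node
    fanout=1 gives a sequential random walk; larger values spread the query wider.
    """
    visited = set()
    results = []
    frontier = [(start, ttl)]
    while frontier:
        node, hops = frontier.pop()
        if node in visited:
            continue
        visited.add(node)
        # Each node evaluates the whole predicate locally, not just a text match.
        results.extend(rec for rec in shares.get(node, []) if predicate(rec))
        if hops > 0:
            candidates = [n for n in network.get(node, []) if n not in visited]
            for nxt in rng.sample(candidates, min(fanout, len(candidates))):
                frontier.append((nxt, hops - 1))
    return results
```

A predicate such as lambda rec: rec["type"] == "pdf" and rec["size"] > 100 plays the role of a simple WHERE clause here. The returned records deliberately carry no information about which node answered, in line with the anonymity goal described above.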

Efficiency issues

We talked about a widespread network comprising very heterogeneous kinds of nodes (LAN connections as well as 56k modems), so how can we be efficient? Real ant behaviour is the key! The ANts network is built to incrementally mark more and more paths among nodes. As new paths are marked, messages are distributed over the less loaded nodes, which lets us achieve speed just as BitTorrent does. The load is uniformly distributed over the network and the overall performance is really good. Downloads are performed by swarming among nodes downloading the same file, using a non-deterministic, stochastic process of data distribution. This gives a very good "dispersion" of the information among nodes, without a central tracker like the one BitTorrent requires.
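The stochastic swarming described above can be sketched in a few lines of Python. This is an assumed, simplified model rather than the real ANts download logic: the function plan_swarm and its arguments are hypothetical, chunks are plain integers, and sources stand for anonymous peers holding complete or partial copies of the file.

```python
import random


def plan_swarm(total_chunks, have, sources, rng=random):
    """Return {chunk_index: source} for every missing chunk that some source holds.

    have:    set of chunk indexes we already own
    sources: source name -> set of chunk indexes that source can serve
    """
    plan = {}
    missing = [i for i in range(total_chunks) if i not in have]
    rng.shuffle(missing)  # request chunks in random order to disperse them
    for chunk in missing:
        holders = [s for s, chunks in sources.items() if chunk in chunks]
        if holders:
            plan[chunk] = rng.choice(holders)  # spread load over the holders at random
    return plan
```

Because both the chunk order and the chosen holder are random, different downloaders tend to end up with different parts of the file, so they can serve each other without any tracker coordinating them.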

About the IRC based connection system

Is there any threat due to the IRC-based connection system? Is it dangerous that my IP appears on IRC? NO! The ANts network and the IRC network are two distinct networks, and the first has nothing to do with the second. When you log on to IRC on starting your ANts client, you usually join a chat room. In the chat room you'll find other peers, and each peer in the channel runs a particular server on port 4568, called the address server. The address server is a sort of crawler that roams around the ANts network collecting the IPs of every node that has free slots for new peers wanting to join. So each node of the ANts network maintains a list of these IPs with free slots. Through IRC (with a simple message) you can obtain the address of a peer that runs the server and is already connected to the network. Once you have the address of the address server, you can query that peer for IPs you can connect to. Once you are connected to the ANts network, your own address server will crawl the network for addresses with free slots too, and other peers will crawl through your node. This causes no threat, because giving out your address in response to a query tells other peers nothing about your location in the network or about the ID you are using. Also, if you are the only peer connected to another, that peer cannot know whether you are the only one connected or whether other peers without free slots are connected to you (the same as for normal queries: no one knows who really answers them).

A very good explanation of the way ANts achieves anonymous transfers can be found on the MUTE website. MUTE and ANts share the same anonymizing technique. MUTE chooses not to use endpoint encryption, while ANts provides two levels of protection by applying endpoint encryption to every communication. This gives good protection INSIDE the ANts network as well. In MUTE every node is able to read everything that passes through it. This is not the case in ANts: a node, even when acting as a man in the middle, will be thwarted in its attack because of the many different, non-deterministic paths taken by messages from a source to a destination within the network.
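The difference between link-only protection and the double-level protection can be shown with a toy Python sketch. Everything here is an assumption made for illustration: the modulus P, the generator G, the helper names, and above all the XOR keystream, which is only a stand-in for AES because the Python standard library has no AES. It must not be used for real security.

```python
import hashlib
import secrets

P = 2**127 - 1  # toy modulus (a Mersenne prime), far too small for real use
G = 3           # toy generator, chosen only for the example


def dh_keypair():
    """Return (private, public) for a toy Diffie-Hellman exchange."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


def dh_shared_key(private, other_public):
    """Derive a 32-byte key from the Diffie-Hellman shared secret."""
    secret = pow(other_public, private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()


def xor_stream(key, data):
    """Toy stand-in for AES: XOR with a SHA-256 keystream (encrypt == decrypt)."""
    out = bytearray()
    for block in range(0, len(data), 32):
        pad = hashlib.sha256(key + block.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[block:block + 32], pad))
    return bytes(out)


def relay_demo(message):
    """Send `message` from source S to destination D through one relay R."""
    s_priv, s_pub = dh_keypair()
    r_priv, r_pub = dh_keypair()
    d_priv, d_pub = dh_keypair()
    end_key = dh_shared_key(s_priv, d_pub)     # endpoint-to-endpoint key (S and D)
    link_sr = dh_shared_key(s_priv, r_pub)     # point-to-point key (S and R)
    link_rd = dh_shared_key(r_priv, d_pub)     # point-to-point key (R and D)

    inner = xor_stream(end_key, message)       # high level: only D can open this
    wire1 = xor_stream(link_sr, inner)         # low level: protects the S-R link
    seen_by_relay = xor_stream(dh_shared_key(r_priv, s_pub), wire1)
    wire2 = xor_stream(link_rd, seen_by_relay)  # relay re-encrypts for the R-D link

    at_dest = xor_stream(dh_shared_key(d_priv, r_pub), wire2)
    recovered = xor_stream(dh_shared_key(d_priv, s_pub), at_dest)
    return seen_by_relay, recovered
```

With link encryption alone, the relay would see the plain message after removing its link layer. With the extra endpoint layer it only sees the inner ciphertext, which is the point made above about protection inside the network.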

Interesting links

MUTE: a similar system. MUTE implements the same idea as ANts P2P using different routing protocols and discovery systems. Its security policy is also different, as no endpoint-secured connections are used (it uses only point-to-point secured connections). The main problems of MUTE are the lack of multiple-source downloads and of a resume system. It can also sometimes be too slow even on a broadband connection (probably due to poor routing policies...).

Webmaster: Gwren