Wednesday, September 24, 2014

Anonymous paper reviews and threat of a legal action

I just stumbled on a news story in which a scientist claims that his career was severely damaged by anonymous comments on PubPeer about some of his published work. This is a very interesting story to follow, for several reasons.

For a start, PubPeer is a site for post-publication review. I strongly support this practice because I believe that everything has to be scrutinized and tested. It helps authors, who get the best possible feedback, and it helps society in general, because there is an ever-increasing problem with scientific ethics. As a side note, I was, and still am, a big proponent of doing the review process in public. That, in my opinion, significantly increases transparency. Anyway, PubPeer fulfils my wishes, but unfortunately for me, it is only concerned with papers from medicine, chemistry and related fields, not computer science.

In this particular case, the author was offered a job at the University of Mississippi with quite a large annual salary, and he quit his current job to take it. The university then revoked the offer, so he lost both the new job and his old one. Now he claims that the reason was a set of anonymous negative comments on PubPeer, and he is threatening a lawsuit to obtain the identities of those who made them.

While, as I said, it is very good to have such a site, that doesn't mean that everything should be allowed. More specifically:
  1. Any claims made have to be justified. Unfortunately, anonymity also lets people make damaging or unjustified claims, certain that there will be no repercussions.
  2. A negative claim casts doubt even when it is not justified, and that can be a problem.
  3. In this particular case it is also unknown why the author didn't respond to the claims about problems in his papers. PubPeer says that it invites the first and last authors to respond to comments.
  4. Finally, no one should take lightly claims that a paper is invalid, not good, etc. In this particular case, I hope that the University of Mississippi verified the negative claims and didn't simply accept what some anonymous commenters said.
In any case, we'll see how this particular case turns out.

Saturday, August 23, 2014

Memory access latencies

Once, I saw a table in which all the memory latencies were scaled so that one CPU cycle is defined to be 1 second; L1 cache latency is then several seconds, L2 cache even more, and so on up to SCSI command timeout and system reboot. This was very interesting because I have a much better developed sense for seconds and larger time units than for nanoseconds, microseconds, etc. A few days ago I remembered that table and wanted to see it again, but I couldn't find it, because it was from a book whose name I couldn't remember. So I started to google for it, and after an hour or so I finally managed to find it. It turns out that it is from the book Systems Performance by Brendan Gregg. I decided to replicate it here for future reference:

Table 2.2: Example Time Scale of System Latencies

Event                                        Latency       Scaled
1 CPU cycle                                  0.3 ns        1 s
Level 1 cache access                         0.9 ns        3 s
Level 2 cache access                         2.8 ns        9 s
Level 3 cache access                         12.9 ns       43 s
Main memory access (DRAM, from CPU)          120 ns        6 min
Solid-state disk I/O (flash memory)          50-150 μs     2-6 days
Rotational disk I/O                          1-10 ms       1-12 months
Internet: San Francisco to New York          40 ms         4 years
Internet: San Francisco to United Kingdom    81 ms         8 years
Internet: San Francisco to Australia         183 ms        19 years
TCP packet retransmit                        1-3 s         105-317 years
OS virtualization system reboot              4 s           423 years
SCSI command timeout                         30 s          3 millennia
Hardware (HW) virtualization system reboot   40 s          4 millennia
Physical system reboot                       5 min         32 millennia
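
The scaling itself is simple to reproduce. Here is a minimal sketch, assuming that the scaled column is just the real latency divided by the 0.3 ns cycle time. The function names and the choice of units are mine, not from the book, and the results can differ from the table by a rounding step.

```python
# Scale real latencies so that one CPU cycle (0.3 ns) lasts one second.
CPU_CYCLE_NS = 0.3  # cycle time taken from the table


def scaled_seconds(latency_ns: float) -> float:
    """Return the latency in 'scaled' seconds (1 CPU cycle == 1 s)."""
    return latency_ns / CPU_CYCLE_NS


def humanize(seconds: float) -> str:
    """Pick a readable unit for a duration given in seconds."""
    units = [("years", 365 * 24 * 3600), ("days", 24 * 3600),
             ("min", 60), ("s", 1)]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:.0f} {name}"
    return f"{seconds:.1f} s"


# A few rows from the table, with latencies converted to nanoseconds.
latencies_ns = {
    "Level 1 cache access": 0.9,
    "Main memory access": 120,
    "Internet: San Francisco to New York": 40e6,  # 40 ms
    "SCSI command timeout": 30e9,                 # 30 s
}

for event, ns in latencies_ns.items():
    print(f"{event}: {humanize(scaled_seconds(ns))}")
```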

It's actually impressive how fast the CPU is compared with other components. It is also a very good argument for multitasking, i.e. assigning the CPU to some other task while waiting for, e.g., the disk or something from the network.

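One additional impressive thing is written below the table in the book. Namely, if you multiply the CPU cycle time by the speed of light (c), you can see how little distance light covers at this scale: only about 9 cm during one 0.3 ns cycle. Put differently, in the time light needs to travel 0.5 m, the CPU has already gone through more than five cycles. That's really impressive. :)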
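
A quick check of that arithmetic, assuming the 0.3 ns cycle time from the table (the names below are only for illustration):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
CPU_CYCLE_S = 0.3e-9                  # cycle time from the table


def light_distance_m(duration_s: float) -> float:
    """Distance light travels in a vacuum during the given time."""
    return SPEED_OF_LIGHT_M_PER_S * duration_s


per_cycle_m = light_distance_m(CPU_CYCLE_S)
cycles_for_half_metre = 0.5 / per_cycle_m

print(f"Light travels {per_cycle_m * 100:.1f} cm per CPU cycle")
print(f"0.5 m takes light about {cycles_for_half_metre:.1f} CPU cycles")
```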

That's it for this post. For the end, while I was searching for this table, I stumbled on some additional interesting links:

Sunday, June 29, 2014

Private addresses in IPv6 protocol

It is almost common wisdom that 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16 are private network addresses that should be used when you don't have an assigned address, or you don't intend to connect to the Internet (at least not directly). With IPv6 becoming ever more popular, and necessary, the question is which addresses are used for private networks in that protocol. In this post I'll try to answer that question.

The truth is that in IPv6 there are two types of private addresses: link-local and unique local addresses. Link-local IPv6 addresses, as the name suggests, are valid only on a single link, for example on a single wireless network. You'll recognize those addresses by their prefix, which is fe80::/10, and they are configured automatically by appending the interface's unique ID to that prefix. IPv4 also has link-local addresses, though they are not used as frequently. Still, maybe you noticed one when your DHCP didn't work and you suddenly had an address that starts with 169.254. That was an automatically configured IPv4 link-local address. The problem with link-local addresses is that they cannot be used when you try to connect two or more networks. They are valid only on a single network, and packets carrying those addresses are not routable! So, we need something else.
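
To illustrate the "prefix plus interface ID" construction, here is a minimal sketch that builds a link-local address from a MAC address using the modified EUI-64 scheme. This is an assumption about how the interface ID is derived: many current systems use randomized or otherwise stable interface IDs instead, and the MAC address below is made up.

```python
import ipaddress


def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Build an fe80::/64 address from a MAC using modified EUI-64."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe in the middle to stretch 48 bits to 64 bits.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80") + bytes(6)  # fe80 followed by 48 zero bits
    return ipaddress.IPv6Address(prefix + interface_id)


addr = link_local_from_mac("00:11:22:33:44:55")  # hypothetical MAC
print(addr, addr.is_link_local)
```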

Unique local addresses (ULA), defined in RFC4193, are closer to IPv4 private addresses. That RFC defines the ULA format and how to generate the addresses. Basically, those are addresses with the prefix FC00::/7. They are treated as normal, global addresses, but they are valid only inside some restricted area and cannot be used on the global Internet. In other words, they can be used within a set of private networks but are not allowed on the global Internet. You choose how this conglomerate of networks will be connected, which prefixes are used, etc.

There is a difference, though. Namely, a ULA is expected to be unique in the world. You might ask why that is important when those addresses are not allowed on the Internet anyway. But it is important. Did it ever happen to you that you had to connect two private IPv4 networks (directly via a router, via VPN, etc.) and, coincidentally, both used the same prefix? Such situations are a pain to debug, and they require renumbering or some nasty tricks to make them work. So, being unique is an important feature.
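
The kind of clash described above is easy to detect before connecting networks. Here is a small sketch using Python's standard ipaddress module; the site names and prefixes are hypothetical and chosen only for illustration.

```python
import ipaddress
from itertools import combinations

# Hypothetical sites; the names and prefixes are made up for illustration.
sites = {
    "office": ipaddress.ip_network("192.168.1.0/24"),
    "partner": ipaddress.ip_network("192.168.1.0/24"),
    "lab": ipaddress.ip_network("10.20.0.0/16"),
}


def find_conflicts(networks: dict) -> list:
    """Return pairs of site names whose prefixes overlap."""
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


print(find_conflicts(sites))
```

Here "office" and "partner" are reported as a conflicting pair, which is exactly the situation that forces renumbering.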

So, the mentioned RFC actually specifies how to generate a ULA /48 prefix that is unique with high probability. Let's first see the exact format of a ULA:
| 7 bits |1|  40 bits   |  16 bits  |          64 bits           |
| Prefix |L| Global ID  | Subnet ID |        Interface ID        |
As you can see, the prefix is 7 bits long, but the L bit must be set to 1 if the address is assigned according to RFC4193, so the first 8 bits have the fixed value 0xFD. Note that the meaning of the L bit set to 0 isn't specified; it is left for the future. Now, the main part is the Global ID, whose length is 40 bits. It must be generated in such a way that it is unique with high probability. This is done in the following way:
  1. Obtain the current time in the 64-bit format specified in the NTP specification.
  2. Obtain an identifier of the system running this algorithm (EUI-64, MAC, serial number).
  3. Concatenate the previous two and hash the result using SHA-1.
  4. Take the low-order 40 bits as the Global ID.
The resulting prefix can now be used for a site. The Subnet ID can be used to number multiple subnets within the site. There are Web-based implementations of the algorithm that you can use either to get a feeling for the generated addresses or to generate a prefix for your concrete situation.
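
The steps above, together with the address layout, can be sketched in a few lines of Python. Treat this as an illustration under stated assumptions, not a reference implementation: the function names are mine, the system identifier is taken from uuid.getnode() (which may return a random value if no MAC address can be found), the MAC is expanded to a modified EUI-64, and "low-order 40 bits" is read as the last five bytes of the SHA-1 digest.

```python
import hashlib
import ipaddress
import time
import uuid

NTP_EPOCH_OFFSET = 2_208_988_800  # seconds between 1900-01-01 and 1970-01-01


def ntp_timestamp(unix_time: float) -> bytes:
    """Encode a Unix time as a 64-bit NTP timestamp (32.32 fixed point)."""
    seconds = (int(unix_time) + NTP_EPOCH_OFFSET) & 0xFFFFFFFF
    fraction = int((unix_time % 1) * 2**32)
    return seconds.to_bytes(4, "big") + fraction.to_bytes(4, "big")


def eui64_from_mac(mac: int) -> bytes:
    """Expand a 48-bit MAC into a modified EUI-64 identifier (assumption)."""
    raw = bytearray(mac.to_bytes(6, "big"))
    raw[0] ^= 0x02  # flip the universal/local bit
    return bytes(raw[:3]) + b"\xff\xfe" + bytes(raw[3:])


def generate_ula_prefix(unix_time=None, mac=None) -> ipaddress.IPv6Network:
    """Steps 1-4: time + system ID -> SHA-1 -> 40-bit Global ID -> /48."""
    unix_time = time.time() if unix_time is None else unix_time
    mac = uuid.getnode() if mac is None else mac
    key = ntp_timestamp(unix_time) + eui64_from_mac(mac)
    global_id = hashlib.sha1(key).digest()[-5:]  # low-order 40 bits
    # 0xFD = 7-bit prefix plus L bit set to 1; the remaining 80 bits are zero.
    packed = b"\xfd" + global_id + bytes(10)
    return ipaddress.IPv6Network((packed, 48))


def split_ula(address: str) -> dict:
    """Split an address into the fields shown in the format diagram."""
    value = int(ipaddress.IPv6Address(address))
    return {
        "prefix": value >> 121,                        # top 7 bits
        "l_bit": (value >> 120) & 0x1,                 # 1 bit
        "global_id": (value >> 80) & ((1 << 40) - 1),  # 40 bits
        "subnet_id": (value >> 64) & 0xFFFF,           # 16 bits
        "interface_id": value & ((1 << 64) - 1),       # 64 bits
    }


prefix = generate_ula_prefix()
print("Generated site prefix:", prefix)
print("Fields:", split_ula(str(prefix.network_address)))
```

Because the current time goes into the hash, every run produces a different prefix. Generate it once and then keep using the same prefix for the whole site.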

Occasionally you'll stumble upon so-called site-local addresses. They were defined in the initial IPv6 addressing architecture, RFC1884, and kept in subsequent revisions of the addressing architecture (RFC2373, RFC3513), but were finally deprecated in RFC3879. Since they were defined for so long (more than 8 years), you might stumble upon them in some legacy applications. They are recognizable by their prefix, FEC0::/10. You shouldn't use them any more; use ULA instead.
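
To tie the three prefixes from this post together, here is a small sketch that tells them apart by prefix membership. The labels and the function name are mine; the sample addresses are arbitrary.

```python
import ipaddress

# Prefixes mentioned in this post.
SCOPES = [
    ("link-local", ipaddress.ip_network("fe80::/10")),
    ("unique local (ULA)", ipaddress.ip_network("fc00::/7")),
    ("site-local (deprecated)", ipaddress.ip_network("fec0::/10")),
]


def classify(address: str) -> str:
    """Return which of the prefixes above the IPv6 address falls into."""
    addr = ipaddress.IPv6Address(address)
    for label, network in SCOPES:
        if addr in network:
            return label
    return "other"


for sample in ("fe80::1", "fd12:3456:789a::1", "fec0::1", "2001:db8::1"):
    print(sample, "->", classify(sample))
```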
