[ bandan.das.name | feedburner ]
Recently, a friend of mine asked :
Why does the struct list_head in the Linux kernel has no data
field ?
In other words, what he means is this:
The way we are introduced to linked lists in high school is
struct linked_list {
void *data;
struct linked_list *prev *next;
}
And assume, we have a user defined data structure:
struct user_data {
int a;
char b;
}
The approach we usually use to create a list would be :
struct linked_list head;
struct user_data mydata1;
head.data = (void *)&mydata1;
head.next = NULL;
head.prev = NULL;
And to add another node to the list :
struct linked_list newentry;
struct user_data mydata2;
newentry.data = (void *)&mydata2;
newentry.prev = &head;
newentry.next = NULL;
and so on..
However in the Linux kernel, a linked list structure is defined
something like:
struct list_head {
struct list_head *next, *prev;
}
and to do the same thing as we did above, we do something like :
struct user_data {
int a;
char b;
struct list_head new_list;
}
/* Create head */
struct user_data head;
head.new_list.prev=NULL;
head.new_list.next=NULL; ...
/* Add a new node */
struct user_data newentry; ...
newentry.new_list.prev=&head.new_list;
head.new_list.next = &newentry.new_list;
So, in one case we have the data embedded into the linked list
(high school style) and in the other, we have the linked list
embedded into data. So, what's the advantage of the second over
the first ?
I actually couldn't find any documentation justifying this design
but these obvious reasons come to my mind.
One advantage is that the kernel programmer is relieved from the
additional care he has to take during typecasting if she chooses
the first approach. As we all know, typecasting is a necessary
evil in C, but using it for linked lists that's invariably used
almost everywhere in the kernel is likely to double the number of
kernel bugs!
Second, with the first case, with each node created, we end up
using more space compared to the second approach. This is because
struct linked_list has an additional void *data defined thereby
making the node larger in size. This is definitely an important
issue in embedded systems if not on x86.
Last but not the least, an important advantage that this design
offers is flexibility in list walking. For example, if you have
the address of newentry (see above example), you can access
newentry->new_list and then go back and forth. Then, using the
container_of() macros, any of the associated user_data for any
node could be accessed! Even if we just have the address of
list_head somewhere in the middle of a linked list, we could jump
to the associated data structure. With the first design, in order
to do list walking and modifications, we always need the address
of any of the node. This means that you always have to pass struct
linked_list pointers even if all you really wanted to do is
manipulate struct user_data.