Re: UDP kernel sockets
- Subject: Re: UDP kernel sockets
- From: Adi Masputra <email@hidden>
- Date: Tue, 10 Jun 2008 07:08:17 -0700
Michael,
Please file a radar against this; thanks.
Adi
On Jun 10, 2008, at 5:47 AM, Michael Tüxen wrote:
Dear all,
I have done some further testing...
I think it is a bug in the kernel:
sock_receive_internal() in kpi_socket.c does:

    if (msg && msg->msg_control) {
        if ((size_t)msg->msg_controllen < sizeof(struct cmsghdr)) return EINVAL;
        if ((size_t)msg->msg_controllen > MLEN) return EINVAL;
        control = m_get(M_NOWAIT, MT_CONTROL);
        if (control == NULL) return ENOMEM;
        memcpy(mtod(control, caddr_t), msg->msg_control, msg->msg_controllen);
        control->m_len = msg->msg_controllen;
    }
    /* let pru_soreceive handle the socket locking */
    error = sock->so_proto->pr_usrreqs->pru_soreceive(sock, &fromsa, auio,
        data, control ? &control : NULL, &flags);

Now assume that msg != NULL and msg->msg_control != NULL. So an mbuf is
allocated and a pointer to it is stored in control. Now pru_soreceive is
called. For UDP this is just soreceive() from uipc_socket.c.
soreceive() starts with

    soreceive(struct socket *so, struct sockaddr **psa, struct uio *uio,
        struct mbuf **mp0, struct mbuf **controlp, int *flagsp)
    {
        ...
        if (controlp)
            *controlp = 0;

So the pointer to the mbuf allocated in sock_receive_internal() is lost
and the mbuf is leaked.
Later on, *controlp is set to point to an mbuf containing the control
information, which is why the code works. That second mbuf is freed.
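
The pattern described above can be reproduced in user space. The following is only a minimal model, not kernel code: struct buf, the fixed pool, buf_get(), buf_free(), model_soreceive(), model_receive_once() and leaked_after() are all hypothetical names standing in for the mbuf allocator and the two kernel functions. It assumes the caller frees whatever pointer it holds after the call, as the analysis above suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE    64   /* hypothetical pool size, not a kernel value */
#define BUF_DATA_MAX 64   /* stands in for MLEN */

/* Hypothetical stand-in for an mbuf. */
struct buf {
    int    in_use;
    size_t len;
    char   data[BUF_DATA_MAX];
};

static struct buf pool[POOL_SIZE];

static struct buf *buf_get(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].len = 0;
            return &pool[i];
        }
    }
    return NULL;
}

static void buf_free(struct buf *b)
{
    if (b != NULL) {
        assert(b->in_use);
        b->in_use = 0;
    }
}

/* Models the start of soreceive(): the incoming pointer is overwritten,
 * and later replaced by a buffer holding the received control data. */
static int model_soreceive(struct buf **controlp)
{
    if (controlp != NULL) {
        *controlp = NULL;       /* caller's buffer becomes unreachable */
        *controlp = buf_get();  /* control data handed back to the caller */
        if (*controlp == NULL)
            return -1;
    }
    return 0;
}

/* Models sock_receive_internal(): pre-allocate, copy, call down. */
static int model_receive_once(const void *ctl, size_t ctl_len)
{
    struct buf *control = NULL;

    if (ctl != NULL) {
        if (ctl_len > BUF_DATA_MAX)
            return -1;
        control = buf_get();
        if (control == NULL)
            return -1;
        memcpy(control->data, ctl, ctl_len);
        control->len = ctl_len;
    }
    if (model_soreceive(control ? &control : NULL) != 0)
        return -1;
    buf_free(control);          /* frees only the buffer the callee returned */
    return 0;
}

/* Returns the number of buffers still marked in use after `packets`
 * receives, or -1 if the pool ran out. */
int leaked_after(int packets)
{
    char cmsgbuf[16] = {0};
    int live = 0;

    memset(pool, 0, sizeof(pool));
    for (int i = 0; i < packets; i++)
        if (model_receive_once(cmsgbuf, sizeof(cmsgbuf)) != 0)
            return -1;
    for (size_t i = 0; i < POOL_SIZE; i++)
        live += pool[i].in_use;
    return live;
}
```

In this model, calling leaked_after(n) leaves n buffers in use, one per receive, which corresponds to the growing "ancillary data" count in the netstat -m output quoted further down.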
Is this analysis correct and is it a bug in Mac OS X?
As a fix, I think one can just remove

    control = m_get(M_NOWAIT, MT_CONTROL);
    if (control == NULL) return ENOMEM;
    memcpy(mtod(control, caddr_t), msg->msg_control, msg->msg_controllen);
    control->m_len = msg->msg_controllen;

or is there any information passed to pru_soreceive via controlp? If
that is the case, soreceive() should free the mbuf and not just throw
away the pointer to it...
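
Both alternatives can be compared in a small user-space model. This is only a sketch under assumptions, not a patch: struct buf, the pool, enum fix and every function below are hypothetical stand-ins. One assumption deserves attention: if the allocation is removed, control stays NULL, so the quoted "control ? &control : NULL" would pass NULL and no control data would come back. The sketch therefore passes &control whenever the caller asked for control data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 64            /* hypothetical pool size */

/* Hypothetical stand-in for an mbuf. */
struct buf {
    int in_use;
};

static struct buf pool[POOL_SIZE];

static struct buf *buf_get(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            return &pool[i];
        }
    }
    return NULL;
}

static void buf_free(struct buf *b)
{
    if (b != NULL)
        b->in_use = 0;
}

enum fix {
    FIX_NONE,                /* behaviour as analysed in the mail */
    FIX_CALLER_SKIPS_ALLOC,  /* first proposal: drop the pre-allocation */
    FIX_CALLEE_FREES         /* second proposal: callee frees before reset */
};

/* Models soreceive(); optionally frees the incoming buffer first. */
static int model_soreceive(struct buf **controlp, int callee_frees)
{
    if (controlp != NULL) {
        if (callee_frees)
            buf_free(*controlp);
        *controlp = NULL;
        *controlp = buf_get();   /* control data for the caller */
        if (*controlp == NULL)
            return -1;
    }
    return 0;
}

/* Models sock_receive_internal() under the selected variant. */
static int model_receive_once(int want_control, enum fix fix)
{
    struct buf *control = NULL;

    if (want_control && fix != FIX_CALLER_SKIPS_ALLOC) {
        control = buf_get();
        if (control == NULL)
            return -1;
    }
    /* Assumption: pass &control based on the caller's request, not on
     * whether a buffer was pre-allocated. */
    if (model_soreceive(want_control ? &control : NULL,
                        fix == FIX_CALLEE_FREES) != 0)
        return -1;
    buf_free(control);
    return 0;
}

/* Buffers still in use after `packets` receives, or -1 on pool exhaustion. */
int in_use_after(int packets, enum fix fix)
{
    int live = 0;

    memset(pool, 0, sizeof(pool));
    for (int i = 0; i < packets; i++)
        if (model_receive_once(1, fix) != 0)
            return -1;
    for (size_t i = 0; i < POOL_SIZE; i++)
        live += pool[i].in_use;
    return live;
}
```

In this model either variant brings the count back to zero, while FIX_NONE leaves one buffer in use per receive. Which of the two is appropriate in the real kernel depends on whether any protocol reads the incoming control mbuf, which is exactly the open question above.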
Best regards
Michael
On Jun 9, 2008, at 6:48 PM, Michael Tüxen wrote:
Dear all,
I'm using a UDP kernel socket created by (error handling not shown):

    error = sock_socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP,
        sctp_over_udp_ipv4_cb, NULL, &sctp_over_udp_ipv4_so);
    error = sock_setsockopt(sctp_over_udp_ipv4_so, IPPROTO_IP,
        IP_RECVDSTADDR, (const void *)&on, (int)sizeof(int));
    memset((void *)&addr_ipv4, 0, sizeof(struct sockaddr_in));
    addr_ipv4.sin_len = sizeof(struct sockaddr_in);
    addr_ipv4.sin_family = AF_INET;
    addr_ipv4.sin_port = htons(sctp_udp_tunneling_port);
    addr_ipv4.sin_addr.s_addr = htonl(INADDR_ANY);
    error = sock_bind(sctp_over_udp_ipv4_so,
        (const struct sockaddr *)&addr_ipv4);

and the sctp_over_udp_ipv4_cb looks like (also error handling suppressed):

    void
    sctp_over_udp_ipv4_cb(socket_t udp_sock, void *cookie, int watif)
    {
        errno_t error;
        size_t length;
        mbuf_t packet;
        struct msghdr msg;
        struct sockaddr_in src, dst;
        char cmsgbuf[CMSG_SPACE(sizeof (struct in_addr))];
        struct cmsghdr *cmsg;
        struct ip *ip;
        struct mbuf *ip_m;

        bzero((void *)&msg, sizeof(struct msghdr));
        bzero((void *)&src, sizeof(struct sockaddr_in));
        bzero((void *)&dst, sizeof(struct sockaddr_in));
        bzero((void *)cmsgbuf, CMSG_SPACE(sizeof (struct in_addr)));

        msg.msg_name = (void *)&src;
        msg.msg_namelen = sizeof(struct sockaddr_in);
        msg.msg_iov = NULL;
        msg.msg_iovlen = 0;
        msg.msg_control = (void *)cmsgbuf;
        msg.msg_controllen = CMSG_LEN(sizeof (struct in_addr));
        msg.msg_flags = 0;

        length = (1<<16);
        error = sock_receivembuf(udp_sock, &msg, &packet, 0, &length);

        for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
            if ((cmsg->cmsg_level == IPPROTO_IP) &&
                (cmsg->cmsg_type == IP_RECVDSTADDR)) {
                dst.sin_family = AF_INET;
                dst.sin_len = sizeof(struct sockaddr_in);
                dst.sin_port = htons(sctp_udp_tunneling_port);
                memcpy((void *)&dst.sin_addr, (const void *)CMSG_DATA(cmsg),
                    sizeof(struct in_addr));
            }
        }

        ip_m = sctp_get_mbuf_for_msg(sizeof(struct ip), 1, M_DONTWAIT, 1,
            MT_DATA);
        ip_m->m_pkthdr.rcvif = packet->m_pkthdr.rcvif;
        ip = mtod(ip_m, struct ip *);
        bzero((void *)ip, sizeof(struct ip));
        ip->ip_v = IPVERSION;
        ip->ip_len = length;
        ip->ip_src = src.sin_addr;
        ip->ip_dst = dst.sin_addr;
        SCTP_HEADER_LEN(ip_m) = sizeof(struct ip) + length;
        SCTP_BUF_LEN(ip_m) = sizeof(struct ip);
        SCTP_BUF_NEXT(ip_m) = packet;

        sctp_input_with_port(ip_m, sizeof(struct ip), src.sin_port);
        return;
    }

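
The control-message loop in the callback can be exercised on its own in user space, without a socket. The sketch below is an illustration only: find_dst_addr() and roundtrip_dst_addr() are hypothetical helper names, and the fallback definition of IP_RECVDSTADDR (value 7, as on BSD-derived systems) is an assumption added so the example also builds where the header does not define it. It builds a control buffer by hand, in the shape a receive call would return, and then walks it with the same CMSG_FIRSTHDR/CMSG_NXTHDR loop.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#ifndef IP_RECVDSTADDR
#define IP_RECVDSTADDR 7   /* assumption: BSD value, for platforms lacking it */
#endif

/* Walk the control messages in msg and copy out the destination address.
 * Returns 1 if an IP_RECVDSTADDR message was found, 0 otherwise. */
int find_dst_addr(struct msghdr *msg, struct in_addr *dst)
{
    struct cmsghdr *cmsg;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == IPPROTO_IP &&
            cmsg->cmsg_type == IP_RECVDSTADDR &&
            cmsg->cmsg_len >= CMSG_LEN(sizeof(*dst))) {
            memcpy(dst, CMSG_DATA(cmsg), sizeof(*dst));
            return 1;
        }
    }
    return 0;
}

/* Build a one-entry control buffer carrying host_addr, parse it again,
 * and return the parsed address in host byte order (0 if not found). */
uint32_t roundtrip_dst_addr(uint32_t host_addr)
{
    /* The union keeps the buffer aligned for struct cmsghdr. */
    union {
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(struct in_addr))];
    } ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    struct in_addr in, out;

    memset(&ctl, 0, sizeof(ctl));
    memset(&msg, 0, sizeof(msg));
    memset(&out, 0, sizeof(out));
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL)
        return 0;
    cmsg->cmsg_level = IPPROTO_IP;
    cmsg->cmsg_type = IP_RECVDSTADDR;
    cmsg->cmsg_len = CMSG_LEN(sizeof(in));
    in.s_addr = htonl(host_addr);
    memcpy(CMSG_DATA(cmsg), &in, sizeof(in));

    if (!find_dst_addr(&msg, &out))
        return 0;
    return ntohl(out.s_addr);
}
```

Unlike the callback, the helper also checks cmsg_len before copying, so a truncated control message is skipped rather than read past its end.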
The code works. However, for each received UDP packet one mbuf for
ancillary data is leaked.
So after a short transfer I see:
[mbp15:~] tuexen% netstat -m
16933/17080 mbufs in use:
473 mbufs allocated to data
16459 mbufs allocated to ancillary data
1 mbufs allocated to Appletalk data blocks
147 mbufs allocated to caches
2854/3664 mbuf 2KB clusters in use
0/16 mbuf 4KB clusters in use
0/0 mbuf 16KB clusters in use
7392 KB allocated to network (69.6% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to drain routines
Is the code above not correct or is it a bug in the kernel?
Thank you very much for your help!
Best regards
Michael
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Darwin-kernel mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden