My first contribution to open-source software was to Linux: a cosmetic cleanup of a tiny bug that was introduced in 1996, the year I was born.
See also: [Git](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5000b28b0b1a34144b39376318cafb8c2a0f79fd).
```
commit 5000b28b0b1a34144b39376318cafb8c2a0f79fd
Author: Kuniyuki Iwashima <kuni1840@gmail.com>
Date:   Tue Dec 10 02:41:48 2019 +0000

    tcp: Cleanup duplicate initialization of sk->sk_state.

    When a TCP socket is created, sk->sk_state is initialized twice as
    TCP_CLOSE in sock_init_data() and tcp_init_sock(). The tcp_init_sock() is
    always called after the sock_init_data(), so it is not necessary to update
    sk->sk_state in the tcp_init_sock().

    Before v2.1.8, the code of the two functions was in the inet_create(). In
    the patch of v2.1.8, the tcp_v4/v6_init_sock() were added and the code of
    initialization of sk->state was duplicated.

    Signed-off-by: Kuniyuki Iwashima <kuni1840@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 8a39ee794891..09e2cae92956 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -443,8 +443,6 @@ void tcp_init_sock(struct sock *sk)
tp->tsoffset = 0;
tp->rack.reo_wnd_steps = 1;
- sk->sk_state = TCP_CLOSE;
-
sk->sk_write_space = sk_stream_write_space;
sock_set_flag(sk, SOCK_USE_WRITE_QUEUE);
```
When you bind sockets to the same port with the *SO_REUSEPORT* option (as nginx does, for example), they are grouped in a single *sock_reuseport* structure, which holds up to *sock_reuseport.max_socks* sockets. When the number of sockets exceeds that limit, *reuseport_grow()* doubles it. *max_socks* was initialized in both *__reuseport_alloc()* and *reuseport_grow()*; this commit removes the latter initialization.
What is the default value of *max_socks*? Hmm... 128! So *reuseport_grow()* will hardly ever be called ;-)
See also: [Git](https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=cd94ef06392ffd49e0a0e1c28bc5cd44f37f1f6b).
```
commit cd94ef06392ffd49e0a0e1c28bc5cd44f37f1f6b
Author: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Date:   Sat Jan 25 10:41:02 2020 +0000

    soreuseport: Cleanup duplicate initialization of more_reuse->max_socks.

    reuseport_grow() does not need to initialize the more_reuse->max_socks
    again. It is already initialized in __reuseport_alloc().

    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/net/core/sock_reuseport.c b/net/core/sock_reuseport.c
index f19f179538b9..91e9f2223c39 100644
--- a/net/core/sock_reuseport.c
+++ b/net/core/sock_reuseport.c
@@ -107,7 +107,6 @@ static struct sock_reuseport *reuseport_grow(struct sock_reuseport *reuse)
if (!more_reuse)
return NULL;
- more_reuse->max_socks = more_socks_size;
more_reuse->num_socks = reuse->num_socks;
more_reuse->prog = reuse->prog;
more_reuse->reuseport_id = reuse->reuseport_id;
```
This patch set consists of four patches.
- [tcp: Remove unnecessary conditions in inet_csk_bind_conflict().](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=16f6c2518f9e0347eb54d368473ebd0904ac4298)
- [tcp: bind(0) remove the SO_REUSEADDR restriction when ephemeral ports are exhausted.](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4b01a9674231a97553a55456d883f584e948a78d)
- [tcp: Forbid to bind more than one sockets haveing SO_REUSEADDR and SO_REUSEPORT per EUID.](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=335759211a327d61244580070d74f55561c35895)
- [selftests: net: Add SO_REUSEADDR test to check if 4-tuples are fully utilized.](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7f204a7de8b08542aca3c1daa96ed20e1177ba87)
Without these patches, binding a socket to an ephemeral port fails once all of the ports are exhausted, even if every socket has SO_REUSEADDR enabled. Yet in that situation we could still connect to different remote hosts.
I added the net.ipv4.ip_autobind_reuse option and fixed the behaviour so that the whole space of local (addr, port) tuples can be fully utilized.
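The scenario can be sketched from userspace as follows. This is only an illustration under a few assumptions: Linux, the loopback address, and the usual procfs mapping of the sysctl to `/proc/sys/net/ipv4/ip_autobind_reuse`. The helper names are hypothetical.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Reads net.ipv4.ip_autobind_reuse through procfs.
 * Returns its value, or -1 if the file is missing or unreadable
 * (for example on a kernel that predates the option).
 */
int read_autobind_reuse(void)
{
	int value = -1;
	FILE *f = fopen("/proc/sys/net/ipv4/ip_autobind_reuse", "r");

	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}

/* Creates a TCP socket with SO_REUSEADDR and lets the kernel choose an
 * ephemeral port via bind(0). Stores the chosen port in *port and
 * returns the fd, or -1 on error.
 */
int bind_ephemeral_reuseaddr(unsigned short *port)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int one = 1;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	assert(port);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(0);	/* bind(0): ask for an ephemeral port */

	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0 ||
	    bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
		close(fd);
		return -1;
	}

	*port = ntohs(addr.sin_port);
	return fd;
}
```

Calling `bind_ephemeral_reuseaddr()` in a loop until it fails is one way to reproduce the exhaustion described above; the failure point should depend on the ephemeral port range and on whether the option is enabled.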
```
commit a594920f8747fa032c784c3660d6cd5a8ab291f8
Author: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Date:   Sat Jul 11 00:57:59 2020 +0900

    inet: Remove an unnecessary argument of syn_ack_recalc().

    Commit 0c3d79bce48034018e840468ac5a642894a521a3 ("tcp: reduce SYN-ACK
    retrans for TCP_DEFER_ACCEPT") introduces syn_ack_recalc() which decides
    if a minisock is held and a SYN+ACK is retransmitted or not.

    If rskq_defer_accept is not zero in syn_ack_recalc(), max_retries always
    has the same value because max_retries is overwritten by rskq_defer_accept
    in reqsk_timer_handler().

    This commit adds three changes:
    - remove redundant non-zero check for rskq_defer_accept in
      reqsk_timer_handler().
    - remove max_retries from the arguments of syn_ack_recalc() and use
      rskq_defer_accept instead.
    - rename thresh to max_syn_ack_retries for readability.

    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
    Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com>
    CC: Julian Anastasov <ja@ssi.bg>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index afaf582a5aa9..22b0e7336360 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -648,20 +648,19 @@ struct dst_entry *inet_csk_route_child_sock(const struct sock *sk,
EXPORT_SYMBOL_GPL(inet_csk_route_child_sock);
/* Decide when to expire the request and when to resend SYN-ACK */
-static inline void syn_ack_recalc(struct request_sock *req, const int thresh,
- const int max_retries,
- const u8 rskq_defer_accept,
- int *expire, int *resend)
+static void syn_ack_recalc(struct request_sock *req,
+ const int max_syn_ack_retries,
+ const u8 rskq_defer_accept,
+ int *expire, int *resend)
{
if (!rskq_defer_accept) {
- *expire = req->num_timeout >= thresh;
+ *expire = req->num_timeout >= max_syn_ack_retries;
*resend = 1;
return;
}
- *expire = req->num_timeout >= thresh &&
- (!inet_rsk(req)->acked || req->num_timeout >= max_retries);
- /*
- * Do not resend while waiting for data after ACK,
+ *expire = req->num_timeout >= max_syn_ack_retries &&
+ (!inet_rsk(req)->acked || req->num_timeout >= rskq_defer_accept);
+ /* Do not resend while waiting for data after ACK,
* start to resend on end of deferring period to give
* last chance for data or ACK to create established socket.
*/
@@ -720,15 +719,12 @@ static void reqsk_timer_handler(struct timer_list *t)
struct net *net = sock_net(sk_listener);
struct inet_connection_sock *icsk = inet_csk(sk_listener);
struct request_sock_queue *queue = &icsk->icsk_accept_queue;
- int qlen, expire = 0, resend = 0;
- int max_retries, thresh;
- u8 defer_accept;
+ int max_syn_ack_retries, qlen, expire = 0, resend = 0;
if (inet_sk_state_load(sk_listener) != TCP_LISTEN)
goto drop;
- max_retries = icsk->icsk_syn_retries ? : net->ipv4.sysctl_tcp_synack_retries;
- thresh = max_retries;
+ max_syn_ack_retries = icsk->icsk_syn_retries ? : net->ipv4.sysctl_tcp_synack_retries;
/* Normally all the openreqs are young and become mature
* (i.e. converted to established socket) for first timeout.
* If synack was not acknowledged for 1 second, it means
@@ -750,17 +746,14 @@ static void reqsk_timer_handler(struct timer_list *t)
if ((qlen << 1) > max(8U, READ_ONCE(sk_listener->sk_max_ack_backlog))) {
int young = reqsk_queue_len_young(queue) << 1;
- while (thresh > 2) {
+ while (max_syn_ack_retries > 2) {
if (qlen < young)
break;
- thresh--;
+ max_syn_ack_retries--;
young <<= 1;
}
}
- defer_accept = READ_ONCE(queue->rskq_defer_accept);
- if (defer_accept)
- max_retries = defer_accept;
- syn_ack_recalc(req, thresh, max_retries, defer_accept,
+ syn_ack_recalc(req, max_syn_ack_retries, READ_ONCE(queue->rskq_defer_accept),
&expire, &resend);
req->rsk_ops->syn_ack_timeout(req);
if (!expire &&
```
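To make the expiry rule in the patched *syn_ack_recalc()* easier to follow, here is a small userspace model of the `*expire` computation only. `struct req_model` and `should_expire()` are stand-ins I made up; the `*resend` logic for the deferral case is not visible in the hunk above, so it is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the request_sock fields that the diff reads. */
struct req_model {
	int num_timeout;	/* how many times the SYN+ACK timer has fired */
	bool acked;		/* stand-in for inet_rsk(req)->acked */
};

/* Mirrors the *expire computation of the patched syn_ack_recalc(). */
bool should_expire(const struct req_model *req, int max_syn_ack_retries,
		   unsigned char rskq_defer_accept)
{
	assert(req);

	/* Without TCP_DEFER_ACCEPT, only the retry limit matters. */
	if (!rskq_defer_accept)
		return req->num_timeout >= max_syn_ack_retries;

	/* With deferral, a request whose handshake was already ACKed is
	 * kept until the deferral period has also run out.
	 */
	return req->num_timeout >= max_syn_ack_retries &&
	       (!req->acked || req->num_timeout >= rskq_defer_accept);
}
```

For example, with `max_syn_ack_retries = 5` and `rskq_defer_accept = 8`, an ACKed request at its fifth timeout is kept, while an un-ACKed one at the same point expires.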
This patch set addresses two issues which happen when both connected and
unconnected sockets are in the same UDP reuseport group.
- [udp: Copy has_conns in reuseport_grow().](https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=f2b2c55e512879a05456eaf5de4d1ed2f7757509)
- [udp: Improve load balancing for SO_REUSEPORT.](https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=efc6b6f6c3113e8b203b9debfb72d81e0f3dcace)