Discussion: [Linuxptp-devel] Implement bond/team fail over support
Hangbin Liu
2017-06-06 10:22:28 UTC
Hi Richard,

More and more customers are asking for bond/team fail over support in linuxptp.
I have a draft plan to implement this feature. Would you like to review it
and let me know whether the approach is OK?

Here is a brief description.

1. Add slave info to struct interface, something like:
struct interface {
	STAILQ_ENTRY(interface) list;
	char name[MAX_IFNAME_SIZE + 1];
	char slave[MAX_IFNAME_SIZE + 1];
	int index;
	int slave_index;
	int linkup;
	int slave_changed;
	struct sk_ts_info ts_info;
};
2. Get the current interface's active slave via a netlink message in
config_create_interface().
3. Set the clock's PHC index to the active slave's PHC index in clock_create()
(a rough sketch of steps 2-3 follows this list).
4. On bond/team fail over, get the new active slave's index.
5. Update the clock's PHC via clock_switch_phc() and update the port status
via port_dispatch().
6. Use automatic mode in phc2sys so that it keeps syncing the correct PHC.
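
As a rough illustration of steps 2 and 3, something like the following could run
during start up (a sketch only: get_bond_active_slave() is a hypothetical helper
that asks the kernel for the bond's active slave over netlink, while
sk_get_ts_info() and struct sk_ts_info are the existing helpers):

	char slave[MAX_IFNAME_SIZE + 1] = {0};
	struct sk_ts_info ts_info;

	/* Step 2: find out which slave is currently active on the bond. */
	if (get_bond_active_slave("bond0", slave, sizeof(slave)) == 0) {
		/* Step 3: read the PHC index and timestamping capabilities
		 * from the active slave, not from the bond master. */
		sk_get_ts_info(slave, &ts_info);
	} else {
		/* not a bond: fall back to the interface itself */
		sk_get_ts_info("bond0", &ts_info);
	}
	/* ts_info.phc_index then becomes the clock's PHC index. */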

Limits and unresolved issues:
1. Only bond/team active-backup mode is supported.
2. It only works with phc2sys automatic mode. What about a manual
config, e.g. -s bond0?
3. What about VLANs?
4. What if the new active slave does not support the required timestamping
flags? Just report a failure?

Thanks
Hangbin
Hangbin Liu
2017-06-08 08:50:27 UTC
Hi Richard,

Do you have any comments?

Regards
Hangbin
Richard Cochran
2017-06-08 09:52:14 UTC
Post by Hangbin Liu
Do you have any comments?
I was waiting for Miroslav to comment first ;)

He did suggest something some time ago. I have not yet thought much
about bonding, and so I am not yet ready to give an opinion...

Thanks,
Richard
Miroslav Lichvar
2017-06-08 13:50:13 UTC
Post by Richard Cochran
I was waiting for Miroslav to comment first ;)
He did suggest something some time ago. I have not yet thought much
about bonding, and so I am not yet ready to give an opinion...
My suggestion was to modify the kernel to support timestamping
directly on the bonding master interface, so ptp4l could work with it
as with a real interface (except maybe it would need to reopen the PHC
when the active slave is changed). phc2sys would need to know which
slave interfaces are in the bond, keep their clocks synchronized to
each other and track the active slave.

What Hangbin is suggesting is different in that most of the work is
done in ptp4l and there are no or only minimal changes needed in the
kernel and phc2sys. ptp4l would use the master interface for
networking and slave interfaces for timestamping configuration and
clock control. It would need to know which slave interface is active
and react to changes. I think it makes sense and I like it better than
what I suggested.

The question is whether you would be OK with bonding/teaming-specific code
in ptp4l and whether you have any suggestions on the design. Should there
be only one port, or should each slave interface have its own port? The
latter would work better with phc2sys, but I suspect inactive ports
would have to close their sockets (bound to the master interface), so
that only one is open at a time.

As for supporting other bonding modes than active-backup, I think that
would require a completely different approach and I'm not sure if it
actually makes sense with PTP as implemented in linuxptp. Each slave
interface would probably need to have a separate clock, which would
receive only a fraction of the PTP traffic. A single ptp4l instance
would control multiple clocks at the same time and demultiplex PTP
messages from a single socket, or, if a new socket option was
implemented in the kernel to filter outgoing messages by slave
interface, it might be possible to use separate ptp4l instances. In
any case, all requests for bonding support I've seen so far were for
the active-backup mode, so I think it's ok to ignore the other modes
for now.
--
Miroslav Lichvar
Richard Cochran
2017-06-08 14:34:48 UTC
Post by Miroslav Lichvar
What Hangbin is suggesting is different in that most of the work is
done in ptp4l and there are no or only minimal changes needed in the
kernel and phc2sys. ptp4l would use the master interface for
networking and slave interfaces for timestamping configuration and
clock control. It would need to know which slave interface is active
and react to changes. I think it makes sense and I like it better than
what I suggested.
Thanks for the clarification and opinion.
Post by Miroslav Lichvar
The question is whether you would be OK with bonding/teaming-specific code
in ptp4l and whether you have any suggestions on the design. Should there
be only one port, or should each slave interface have its own port?
Ok, I will think about this...

Thanks,
Richard
Hangbin Liu
2017-06-15 13:23:16 UTC
Post by Richard Cochran
Post by Miroslav Lichvar
The question is whether you would be OK with bonding/teaming-specific code
in ptp4l and whether you have any suggestions on the design. Should there
be only one port, or should each slave interface have its own port?
Ok, I will think about this...
Hi Richard,

Sorry to bother you again. But customers are pushing for this.

If you like, I can send a draft patch set for you to review first.

Thanks
Hangbin
Richard Cochran
2017-06-15 15:39:12 UTC
Post by Hangbin Liu
Sorry to bother you again. But customers are pushing for this.
Well, I have been overloaded with work, and today I am feeling sick.
So I will take a look as soon as I can, but your customers will have
to be patient.
Post by Hangbin Liu
If you like, I can send a draft patch set for you to review first.
Sure, if the patches already exist, there is no harm in posting them.

Thanks,
Richard
Richard Cochran
2017-06-17 05:30:59 UTC
Post by Miroslav Lichvar
The question is whether you would be OK with bonding/teaming-specific code
in ptp4l and whether you have any suggestions on the design. Should there
be only one port, or should each slave interface have its own port?
There should be only one port, and we can re-use the PHC index
switching code already in place for JBOD.

As far as phc2sys goes, it can be patched to deal with this setup.

Thanks,
Richard
Richard Cochran
2017-06-17 05:21:08 UTC
Hangbin,

Finally getting back to this...
Post by Hangbin Liu
More and more customers are asking for bond/team fail over support in linuxptp.
I have a draft plan to implement this feature. Would you like to review it
and let me know whether the approach is OK?
Overall, it sounds good. This case is similar to the JBOD case.
Post by Hangbin Liu
Here is a brief description.
1. Add slave info to struct interface, something like:
struct interface {
	STAILQ_ENTRY(interface) list;
	char name[MAX_IFNAME_SIZE + 1];
	char slave[MAX_IFNAME_SIZE + 1];
	int index;
	int slave_index;
	int linkup;
	int slave_changed;
I don't think this new state is needed, because...
Post by Hangbin Liu
	struct sk_ts_info ts_info;
};
2. Get the current interface's active slave via a netlink message in
config_create_interface().
3. Set the clock's PHC index to the active slave's PHC index in clock_create().
... step 2 can also be done in clock_create(), can't it?
Post by Hangbin Liu
4. On bond/team fail over, get the new active slave's index.
How do you detect fail over?
Post by Hangbin Liu
5. Update the clock's PHC via clock_switch_phc() and update the port status
via port_dispatch().
6. Use automatic mode in phc2sys so that it keeps syncing the correct PHC.
Another question: The sockets must be bound to the slave interface,
and not to the bonding master interface, correct? If so, there is
more work to do than clock_switch_phc() at fail over time.
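
(Here "bound" means the usual SO_BINDTODEVICE binding, roughly

	/* restrict the socket to one network device, e.g. the active slave */
	setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, ifname, strlen(ifname));

as a generic illustration rather than the actual transport code, so switching
the active slave would mean re-binding or reopening the socket.)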
Post by Hangbin Liu
1. Only bond/team active-backup mode is supported.
2. It only works with phc2sys automatic mode. What about a manual
config, e.g. -s bond0?
I think phc2sys will also need some work.
Post by Hangbin Liu
3. What about VLANs?
VLAN already works with normal (non-bonding) interfaces. Not sure
about the bonding case.
Post by Hangbin Liu
4. What if the new active slave does not support the required timestamping
flags? Just report a failure?
Yes, just let the port become FAULTY.

Thanks,
Richard
Hangbin Liu
2017-06-19 02:55:21 UTC
Hi Richard,

Thanks for your comments. Here are my replies.
Post by Richard Cochran
Post by Hangbin Liu
Here is a brief description.
1. Add slave info to struct interface, something like:
struct interface {
	STAILQ_ENTRY(interface) list;
	char name[MAX_IFNAME_SIZE + 1];
	char slave[MAX_IFNAME_SIZE + 1];
	int index;
	int slave_index;
	int linkup;
	int slave_changed;
I don't think this new state is needed, because...
Which state? All the new elements, or just the last slave_changed variable?
Post by Richard Cochran
Post by Hangbin Liu
	struct sk_ts_info ts_info;
};
2. Get the current interface's active slave via a netlink message in
config_create_interface().
3. Set the clock's PHC index to the active slave's PHC index in clock_create().
... step 2 can also be done in clock_create(), can't it?
In clock_create() we have:

	STAILQ_FOREACH(iface, &config->interfaces, list) {
		if (iface->ts_info.valid &&
		    ((iface->ts_info.so_timestamping & required_modes) != required_modes)) {
			...
		}
	}

Here we get each iface from the config and check iface->ts_info.valid. The
ts_info needs to be the current slave's ts_info. What I prepared to do is
something like:

 	strncpy(iface->name, name, MAX_IFNAME_SIZE);
-	sk_get_ts_info(iface->name, &iface->ts_info);
+	rtnl_link_info(iface);
+
+	if (iface->slave[0]) {
+		sk_get_ts_info(iface->slave, &iface->ts_info);
+	} else {
+		sk_get_ts_info(iface->name, &iface->ts_info);
+	}

in config_create_interface(), so that we get the interface's slave info and
ts_info during configuration.

So what you want is to get the ts_info in clock_create() and not store the
slave information in struct interface, right?
Post by Richard Cochran
Post by Hangbin Liu
4. On bond/team fail over, get the new active slave's index.
How do you detect fail over?
Via the rtnl_link_status() callback: the bond's active slave is reported in
the IFLA_BOND_ACTIVE_SLAVE netlink attribute.
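
Something like this could do the parsing (a rough sketch only, not the existing
rtnl.c code; I am assuming the attribute arrives nested under
IFLA_LINKINFO/IFLA_INFO_DATA in the RTM_NEWLINK message and carries the active
slave's ifindex):

	#include <linux/if_link.h>
	#include <linux/rtnetlink.h>

	/* Return the active slave's ifindex from an RTM_NEWLINK message for a
	 * bonding master, or 0 if the attribute is absent.  "len" is the
	 * length of the attribute area, i.e. IFLA_PAYLOAD(nlh). */
	static int parse_active_slave(struct ifinfomsg *ifi, int len)
	{
		struct rtattr *rta, *linkinfo = NULL, *infodata = NULL;
		int nested_len, slave_index = 0;

		/* find IFLA_LINKINFO among the top-level attributes */
		for (rta = IFLA_RTA(ifi); RTA_OK(rta, len); rta = RTA_NEXT(rta, len)) {
			if (rta->rta_type == IFLA_LINKINFO)
				linkinfo = rta;
		}
		if (!linkinfo)
			return 0;

		/* IFLA_INFO_DATA is nested inside IFLA_LINKINFO */
		nested_len = RTA_PAYLOAD(linkinfo);
		for (rta = RTA_DATA(linkinfo); RTA_OK(rta, nested_len);
		     rta = RTA_NEXT(rta, nested_len)) {
			if (rta->rta_type == IFLA_INFO_DATA)
				infodata = rta;
		}
		if (!infodata)
			return 0;

		/* IFLA_BOND_ACTIVE_SLAVE carries the active slave's ifindex */
		nested_len = RTA_PAYLOAD(infodata);
		for (rta = RTA_DATA(infodata); RTA_OK(rta, nested_len);
		     rta = RTA_NEXT(rta, nested_len)) {
			if (rta->rta_type == IFLA_BOND_ACTIVE_SLAVE)
				slave_index = *(unsigned int *) RTA_DATA(rta);
		}
		return slave_index;
	}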
Post by Richard Cochran
Post by Hangbin Liu
5. Update the clock's PHC via clock_switch_phc() and update the port status
via port_dispatch().
6. Use automatic mode in phc2sys so that it keeps syncing the correct PHC.
Another question: The sockets must be bound to the slave interface,
and not to the bonding master interface, correct?
Yes.
Post by Richard Cochran
If so, there is more work to do than clock_switch_phc() at fail over time.
Yes, we also need to reset the port's PHC index to the new PHC, make the port
re-initialize via port_dispatch(p, EV_INITIALIZE, 0), reopen/renew the transport
with the slave name, etc. And maybe more that I don't know about yet. Please
correct me if I missed anything when I post the patch set.
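
Roughly, the failover handling could look like this (a sketch of the sequence
only; bond_active_phc_index() is a hypothetical helper, while clock_switch_phc(),
port_dispatch() and EV_INITIALIZE are the names mentioned above):

	/* called when the rtnl callback reports a new active slave */
	static void bond_failover(struct clock *clock, struct port *p,
				  const char *slave)
	{
		int phc_index = bond_active_phc_index(slave); /* hypothetical */

		/* point the clock at the new active slave's PHC ... */
		clock_switch_phc(clock, phc_index);

		/* ... and force the port back through initialization so that
		 * it reopens its transport and timestamping on the new slave */
		port_dispatch(p, EV_INITIALIZE, 0);
	}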
Post by Richard Cochran
Post by Hangbin Liu
1. Only bond/team active-backup mode is supported.
2. It only works with phc2sys automatic mode. What about a manual
config, e.g. -s bond0?
I think phc2sys will also need some work.
Hmm, I'm looking into what we should do to make phc2sys aware that the node
changed and reconfigure it, e.g. around:

	run_pmc_events(node);
	if (node->state_changed) {
		reconfigure(node);
	}
Post by Richard Cochran
Post by Hangbin Liu
3. What about VLANs?
VLAN already works with normal (non-bonding) interfaces. Not sure
about the bonding case.
I think we can talk about this later.
Post by Richard Cochran
Post by Hangbin Liu
4. What if the new active slave does not support the required timestamping
flags? Just report a failure?
Yes, just let the port become FAULTY.
OK, got it.

Thanks
Hangbin
Richard Cochran
2017-06-19 05:54:53 UTC
Post by Hangbin Liu
Post by Richard Cochran
Post by Hangbin Liu
struct interface {
	STAILQ_ENTRY(interface) list;
	char name[MAX_IFNAME_SIZE + 1];
	char slave[MAX_IFNAME_SIZE + 1];
	int index;
	int slave_index;
	int linkup;
	int slave_changed;
I don't think this new state is needed, because...
Which state? All the new elements, or just the last slave_changed variable?
I imagine that none of the new state is needed.
Post by Hangbin Liu
So what you want is to get the ts_info in clock_create() and not store the
slave information in struct interface, right?
Yes, moving the single sk_get_ts_info() call site is better than
adding new state just to remember values from
config_create_interface() till clock_create(), don't you think?

Thanks,
Richard
Hangbin Liu
2017-06-19 06:56:00 UTC
Post by Richard Cochran
Post by Hangbin Liu
Post by Richard Cochran
Post by Hangbin Liu
struct interface {
	STAILQ_ENTRY(interface) list;
	char name[MAX_IFNAME_SIZE + 1];
	char slave[MAX_IFNAME_SIZE + 1];
	int index;
	int slave_index;
	int linkup;
	int slave_changed;
I don't think this new state is needed, because...
Which state? All the new elements, or just the last slave_changed variable?
I imagine that none of the new state is needed.
Post by Hangbin Liu
So what you want is to get the ts_info in clock_create() and not store the
slave information in struct interface, right?
Yes, moving the single sk_get_ts_info() call site is better than
adding new state just to remember values from
config_create_interface() till clock_create(), don't you think?
Then how do we know that the slave changed? Via the netlink attribute
IFLA_BOND_ACTIVE_SLAVE we only learn the current active slave. If we do not
store the previous slave name/index somewhere and compare it with the current
value, how do we know that the slave changed and that the PHC index needs
updating?

Thanks
Hangbin
Richard Cochran
2017-06-19 07:05:25 UTC
Post by Hangbin Liu
Then how do we know that the slave changed? Via the netlink attribute
IFLA_BOND_ACTIVE_SLAVE we only learn the current active slave. If we do not
store the previous slave name/index somewhere and compare it with the current
value, how do we know that the slave changed and that the PHC index needs
updating?
Didn't you say that there is an rtnl notification for that?

Thanks,
Richard
Hangbin Liu
2017-06-19 07:35:16 UTC
Post by Richard Cochran
Post by Hangbin Liu
Then how do we know that the slave changed? Via the netlink attribute
IFLA_BOND_ACTIVE_SLAVE we only learn the current active slave. If we do not
store the previous slave name/index somewhere and compare it with the current
value, how do we know that the slave changed and that the PHC index needs
updating?
Didn't you say that there is an rtnl notification for that?
But other device actions will also trigger an rtnl notification, e.g. an MTU
change. So we need to make sure the slave actually changed before resetting
the clock/port.

Thanks
Hangbin
Richard Cochran
2017-06-19 08:05:15 UTC
Post by Hangbin Liu
But other device actions will also trigger an rtnl notification, e.g. an MTU
change. So we need to make sure the slave actually changed before resetting
the clock/port.
Fair enough, but that requires only one additional variable per port.

Thanks,
Richard
Hangbin Liu
2017-06-19 09:48:26 UTC
Post by Richard Cochran
Post by Hangbin Liu
But other device actions will also trigger an rtnl notification, e.g. an MTU
change. So we need to make sure the slave actually changed before resetting
the clock/port.
Fair enough, but that requires only one additional variable per port.
Yeah... we need at least one variable to store the slave name. We can get the
master's or slave's index with if_nametoindex(); it was just for convenience
that I added the index variable. I will rework my patch set.
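
Something like this (a sketch; p->slave_index stands for the per-port variable
we are discussing, not an existing field):

	#include <net/if.h>

	/* on an rtnl notification, react only if the active slave changed */
	unsigned int new_index = if_nametoindex(new_slave);

	if (new_index && new_index != p->slave_index) {
		p->slave_index = new_index;
		/* ... then switch the PHC and re-initialize the port ... */
	}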

Thanks
Hangbin
