Danube giving a new node an old hostname

asked 2017-10-11 10:31:50 -0800

jokken

updated 2017-10-11 10:35:02 -0800

hi all

I am using Danube Fuel 10. I have a known-good environment: I've used Fuel to deploy the controllers etc. and 10 compute nodes so far, and I have instances running on those nodes.

When attempting to deploy an 11th compute node I am running into issues. Way back, before the 10 compute nodes, there was a previous attempt to deploy this node, which failed because of a bad hard drive. Now that this is fixed I am deploying again.

Back then Fuel referred to it by the hostname node-20. Now Fuel is using the hostname node-31. Everything I see on the surface, including the fuel2 CLI, shows this node as node-31. Yet after Fuel provisions the Ubuntu OS and the host reboots, cloud-init (I believe) sets the old node-20 hostname and uses many settings from node-20 (e.g. the RSA SSH keys).

I believe this is causing Fuel not to be able to communicate with the node, and the deployment times out with #<RuntimeError: Could not find any hosts in discover data provided>

fuel2 node show 31
| id | 31 |
| name | Untitled (68:58) |
| status | ready |
| os_platform | ubuntu |
| roles | [u'compute'] |
| kernel_params | None |
| pending_roles | [] |
| hostname | node-31 |
| fqdn | |
| platform_name | ProLiant BL460c Gen9 |

On this broken node-31 I find the old hostname in these files under /var/lib/cloud:

root@node-20:/var/lib/cloud# grep -R node-20 *

Binary file instance/obj.pkl matches
Binary file instances/nocloud/obj.pkl matches
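
The grep above only shows that the old name is cached in cloud-init's state. A small sketch like the following may help narrow down where that data came from. It assumes the active instance directory contains a `datasource` file, which the cloud-init releases I am aware of write but which is worth verifying on your version; the function name `cloud_init_summary` is made up for illustration.

```shell
#!/bin/sh
# Sketch: summarise where cloud-init got its data on an already-booted node.
# Assumption: cloud-init keeps its state under a directory such as
# /var/lib/cloud, and instance/datasource records the datasource in use.

cloud_init_summary() {
    state_dir="$1"     # e.g. /var/lib/cloud
    stale_name="$2"    # e.g. node-20

    # The datasource file names the datasource class; for NoCloud it
    # usually also names the seed the data was read from.
    if [ -f "$state_dir/instance/datasource" ]; then
        cat "$state_dir/instance/datasource"
    fi

    # -a treats binary files (such as obj.pkl) as text, -l prints only
    # file names, -F matches the hostname as a literal string.
    grep -r -a -l -F -- "$stale_name" "$state_dir"
}
```

Run as root on the affected node with `cloud_init_summary /var/lib/cloud node-20`. If the datasource line names a seed device or path, that is the place to look for stale meta-data and user-data.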

Please let me know what in Fuel 10 is setting these values. Is it cloud-init? And how can I intervene so that they are not set incorrectly?

If I try to delete node-20 I get an error showing it doesn't exist from fuel/2's perspective:

fuel node --node-id 20 --delete-from-db --force
404 Client Error: Not Found for url: (NodeCollection not found)
fuel2 node undiscover -n 20 -f
404 Client Error: Not Found for url: (Node not found)

I guess this has something to do with this file or one similar:

in that file I see
hostname: {{ common.hostname }}
fqdn: {{ common.fqdn }}

I assume the node's UUID or MAC address is already in some Fuel database associated with node-20, and these files are being populated incorrectly because of that.

I've tried many different things to get past this ...


1 answer


answered 2017-11-20 15:14:12 -0800

jokken

I have gotten around this issue. I'm not sure exactly what fixed it, but my theory is based on something I noticed quite by accident.

The servers I was having these troubles on use 80GB hard drives, but they also have a flash drive in them, for small OS deployments. I stumbled across a /dev/sda6 partition on these flash drives.

On this partition I found two files: meta-data and user-data. Those files contained the old node name I mentioned in the original post. This partition must have been detected, and these stale files were used by fuel-agent when the newly provisioned node first booted, even though the node was booting from the 80GB drive, which had its own /dev/sda6 partition.

I suspect the stale/incorrect /dev/sda6 is probably from Fuel 8, when we tried to deploy on some of these flash drives...

Once I deleted the stale/incorrect /dev/sda6, the provisioning and deployment went perfectly.
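
For anyone hitting the same thing, here is a rough sketch of how leftover seed partitions could be spotted before re-provisioning. It assumes the seed filesystem carries the `cidata` label that cloud-init's NoCloud datasource searches for, and that meta-data and user-data are plain text; I have not confirmed that every Fuel release labels its config partition this way. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: look for leftover NoCloud seed filesystems on a node.
# Assumption: the seed filesystem is labelled "cidata" and holds
# plain-text meta-data and user-data files.

# List block devices whose filesystem carries the given label.
list_seed_devices() {
    label="${1:-cidata}"
    blkid -t "LABEL=$label" -o device
}

# Print the hostname/fqdn lines from the seed files in a mounted seed
# directory, each prefixed with the file it came from.
seed_hostnames() {
    seed_dir="$1"
    for f in meta-data user-data; do
        if [ -f "$seed_dir/$f" ]; then
            grep -H -i -E 'hostname|fqdn' "$seed_dir/$f"
        fi
    done
    return 0
}

# Mount each candidate read-only and report what it would tell
# cloud-init. Needs root; nothing on the device is modified.
check_seed_devices() {
    mnt=$(mktemp -d)
    for dev in $(list_seed_devices "$@"); do
        echo "== $dev"
        if mount -o ro "$dev" "$mnt"; then
            seed_hostnames "$mnt"
            umount "$mnt"
        fi
    done
    rmdir "$mnt"
}
```

Running `check_seed_devices` as root (for example from the bootstrap image, before provisioning) should list each labelled device with the hostname it would hand to cloud-init. More than one device, or a hostname that does not match what Fuel shows, would point to a stale seed like the one on my flash drives. Double-check the device name before deleting anything, since names such as /dev/sda can refer to different disks depending on boot order.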


