Wednesday 21 November 2007

Erm..... what the F**k - Why 4 days to respond?

Remember - according to the previous post I DID receive a DHCPACK, so that wasn't the problem.... So why does this email make me seriously worry that 1and1 don't have a clue? Also remember that earlier 1and1 told me there was nothing wrong with the RAID........??


"Thank you for contacting us.
If DHCP is working properly now, it's likely this was a temporary issue.
I'm not aware of any ongoing issues to this extent.
The message you received about the raid simply means the raid had to reconstruct some data, this may have occurred if the system was shutdown improperly and thus some data hadn't been synchronized between the two drives. This does not in and of itself mean that a hard drive has failed, as this would present an entirely different set of errors.
If you have any further questions please do not hesitate to contact us.
--
Sincerely,
J***** B*********

Technical Support


1&1 Internet"

Saturday 17 November 2007

Where's my bloody light-sabre

I can't believe this - nobody seems to be able to read my emails correctly.....


"The DHCPACK did not happen until the server was rebooted - as you will see from the details shown at the bottom of this email, a DHCPACK was not received on NOV11 04:00 onwards - it was only after I initialised a reboot that a DHCPACK was received....
Last Sunday the Server became unresponsive and had to be forced a reboot form the Control Panel.
Kind regards


Netwarriors"

All servers have logs, and most logs have dates and times on them. How they managed not to see that no DHCPACK was received amazes me. Nor were they able to comprehend that the DHCPACK only arrived after the server was rebooted - obviously an underlying issue with 1and1.
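For anyone who wants to run the same check on their own box, here is a minimal sketch in Python. It assumes the classic syslog line format shown in my /var/log/messages excerpt; the function name and the regular expression are my own illustration, not anything supplied by 1and1. It simply reports the timestamps of DHCPREQUEST lines that have no DHCPACK after them in the log. Note that syslog's "last message repeated N times" lines are skipped, so the retries do not show up individually.

```python
import re

# Matches classic syslog dhclient lines, for example:
#   Nov 11 04:44:23 s15278325 dhclient: DHCPREQUEST on eth0 to 87.106.137.249 port 67
# The optional "[pid]" after "dhclient" is an assumption about other syslog setups.
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"dhclient(?:\[\d+\])?:\s+(?P<kind>DHCPREQUEST|DHCPACK)\b"
)


def unanswered_requests(lines):
    """Return timestamps of DHCPREQUEST lines with no later DHCPACK in the log."""
    pending = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a dhclient request/ack line (includes "last message repeated")
        if match.group("kind") == "DHCPREQUEST":
            pending.append(match.group("stamp"))
        else:
            # A DHCPACK answers everything that was outstanding up to this point.
            pending.clear()
    return pending


if __name__ == "__main__":
    sample = [
        "Nov 11 04:07:06 s15278325 syslogd 1.4.1: restart.",
        "Nov 11 04:44:23 s15278325 dhclient: DHCPREQUEST on eth0 to 87.106.137.249 port 67",
        "Nov 11 04:45:07 s15278325 last message repeated 4 times",
    ]
    for stamp in unanswered_requests(sample):
        print("No DHCPACK after request at", stamp)
```

To check a real log, pass the open file instead of the sample list, for example `unanswered_requests(open("/var/log/messages"))`.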

Wow a response in less than a week

Couldn't believe this, I get a response in less than 3 days:


"Thank you for contacting us.

When was the last time the server shutdown unexpectantly?
I do not see anything indicative of a hard disk failure in the logs.
Do you get any errors when working with the file system?
If the server continues to reboot then we can have the hardware replaced.
I do see an ack from the DHCP server.
DHCPACK from 87.106.137.249
If you have any further questions please do not hesitate to contact us.
--
Sincerely,
B**** E***

Technical Support
1&1 Internet"


I would like to ask you all a question - look at the log I submitted earlier in the post - where was the DHCPACK???? And when have I suggested the server keeps rebooting?

Try it out, then ask for information

Why is it that 1and1 didn't even try to log on, and just didn't want to deal with the issue:


"Password is the same as the initial password and hasn’t changed. The only thing is ROOT access is disabled via SSH, and will have to go through the serial console.

Kind regards


Netwarriors"

Finally a response from the Dark Side

Wow, I get a response. Still not very helpful though, as they haven't even tried to log in with my account password - again a very slopey-shouldered approach to support.......


"Thank you for contacting us.
Please reset the root password so we can futher investigate.
Reply back when the password has been reset to the value found in the
1&1 control panel.
If you have any further questions please do not hesitate to contact us.
--
Sincerely,
B**** E***

Dedicated Server Support
1&1 Internet"

Friday 16 November 2007

Surprise - no response

Again, no response within 24 hours, so had to chase them up:


"I still haven't had a response to this email below:
------------------------------------------------
So can you explain this entry:
md: md6: raid array is not clean -- starting background reconstruction
Please also explain why there is no DHCPACK:
Nov 11 04:44:23 s15278325 dhclient: DHCPREQUEST on eth0 to
87.106.137.249
There must be a reason that the server stops responding, it's your Fedora Core 6 minimal image....and your hardware.
So please explain why the serial console stopped working.....
Kind regards


Netwarriors"

Thursday 15 November 2007

Return of the "Computer says no...."

Absolutely astonished at their previous response:


"So can you explain this entry:


md: md6: raid array is not clean -- starting background reconstruction

Please also explain why there is no DHCPACK:
Nov 11 04:44:23 s15278325 dhclient: DHCPREQUEST on eth0 to
87.106.137.249


There must be a reason that the server stops responding, it's your Fedora Core 6 minimal image....and you hardware.


So please explain why the serial console stopped working.....

Kind regards

Netwarriors"

The Philippines strike back.......

This was the beginning of my frustration. As part of my account with 1and1, I am supposed to have 24/7 support, including a 24-hour response to email-related tickets...

Why is it, then, that 3 days passed before I received a response:


"Thank you for contacting us.
I performed a memory and hard drive test on your server though found no errors at this time. Those messages below are not actually any sort of errors with the raid, what you are seeing is the raid autodetection process as the system looks for partitions with matching UUID's to enable as part of the raid.
If you have any further questions please do not hesitate to contact us.
--
Sincerely,
J***** B*********
Technical Support
1&1 Internet
---------------"

Monday 12 November 2007

In a galaxy far, far away........

I took out a 'Root Server I' package with 1and1. A very reasonably priced package with a server specification that could not be matched on the Internet. This, coupled with 'unlimited bandwidth', made it a very attractive package that was untouchable.... Little did I know that UNTOUCHABLE was exactly what my server at 1and1 became.

The server specification included a Serial Console connection that gives the user access to the server should a serious mistake be made, such as accidentally locking yourself out so that SSH will no longer accept your connections. The Serial Console connects the user directly to a console on the server that is independent of SSH and is effectively the same as working on the server locally in 'text console' mode.

This is vital for a development server, as quite often a simple mistake can prevent the user from gaining access. For example, a poorly written script could use up so much CPU time or RAM that there isn't enough left to open a new SSH connection to the server..... The Serial Console isn't affected by this, and on 90% of occasions it is possible to log on to the server and terminate the problematic script.

Three days after taking out my new server and selecting the Linux image I required, Fedora Core 6 (x64), my server just stopped responding. At first I thought this was a bit strange but didn't think much of it, and decided to log on via the Serial Console........

Oh S**t, the Serial Console didn't work either. I logged on to my 1and1 control panel, and rebooted the server.........

After looking through the logs I noticed a number of concerning things, and sent the following to 1and1:


"I suspect that there is a hardware error with a new server package that I took out with 1and1. After about 3 days of non-activity the server becomes unresponsive. It is built with Fedora Core 6 minimal and the only things added to it were Apache, MySQL and PHP.

Logging in via Serial Console gives nothing and the server has to be rebooted from the 1and1 control panel before I get telnet access.

Dmesg shows what appears to be a failing RAID:

md: md6: raid array is not clean -- starting background reconstruction
raid1: raid set md6 active with 2 out of 2 mirrors
md: considering sdb5 ...
md: adding sdb5 ...
md: sdb1 has different UUID to sdb5
md: adding sda5 ...
md: sda1 has different UUID to sdb5
md: resync of RAID array md6
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
md: using 128k window, over a total of 4891648 blocks.
md: created md5
md: bind
md: bind
md: running:
raid1: raid set md5 active with 2 out of 2 mirrors
md: considering sdb1 ...
md: adding sdb1 ...
md: adding sda1 ...
md: created md1
md: bind
md: bind
md: running:
raid1: raid set md1 active with 2 out of 2 mirrors
md: ... autorun DONE.
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 380k freed
md: Autodetecting RAID arrays.

It would also appear that there is a problem with you DHCP server as no DHCP request is acknowledged as per /var/log/messages:

Nov 11 04:07:06 s15278325 syslogd 1.4.1: restart.
Nov 11 04:44:23 s15278325 dhclient: DHCPREQUEST on eth0 to 87.106.137.249 port 67
Nov 11 04:45:07 s15278325 last message repeated 4 times
Nov 11 04:46:19 s15278325 last message repeated 5 times
Nov 11 04:47:34 s15278325 last message repeated 4 times
Nov 11 04:48:50 s15278325 last message repeated 5 times
Nov 11 04:49:55 s15278325 last message repeated 4 times
Nov 11 04:51:05 s15278325 last message repeated 4 times
Nov 11 04:52:10 s15278325 last message repeated 5 times
Nov 11 04:53:13 s15278325 last message repeated 6 times
Nov 11 04:54:16 s15278325 last message repeated 4 times
Nov 11 04:55:23 s15278325 last message repeated 7 times
Nov 11 04:56:43 s15278325 last message repeated 6 times
Nov 11 04:57:48 s15278325 last message repeated 5 times
Nov 11 04:58:51 s15278325 last message repeated 6 times
Nov 11 04:59:52 s15278325 last message repeated 4 times

This is the third time this has happened to this server and would appreciate someone looking into it. It is not yet a current production server so if 1and1 need to carry out reboot and testing on the server to check the hardware then this will be fine as it will not affect any of our services. I finish work on Friday and by the time I log on Monday morning the server hardware is non-responsive again.

This needs to be rectified ASAP as this is a development server for a large World Wide Record Company and this will be holding up their development for their global website.

Kind regards


Netwarriors"

Sunday 11 November 2007

The Beginning

Like many other entrepreneurs on the Internet, I have ideas that require tools. As with all tools, there are good-quality tools, and there are some very, very poor tools.

I required a hosting company, like many others, but because it was just an 'idea' I didn't want to spend enormous amounts of money and wanted to just try it and see how it went.

I did, for my sins, decide to host a server with 1and1 Internet, a choice I wish I had never made. Having been in the IT industry for 15 years, I have seen good hosting, and I have seen some very, very bad hosting....... I will leave you to read through this blog to decide which category 1and1 fall into.

This blog contains factual information regarding a fault I experienced with a server I had with 1and1. This information is not derogatory, and it is not embellished in any way. This blog accurately records the tribulations I have suffered whilst dealing with this company.