| |
All General Major Service Outage Posts
| Web sites down - Closed |
17 Feb 07:57:04 |
Details 17 Feb 06:51:33 |
Our web site and hosted customer web sites are not responding. We are looking into this.
|
Update
17 Feb 07:32:53
|
This issue is the same as the one affecting email, and is related to a disk server.
|
Update
17 Feb 07:43:06
|
We are working on the disk server now.
|
| Started |
17 Feb 06:22:30 |
Closed 17 Feb 07:57:04 |
Web sites now working. |
| Routing blip - Closed |
11 Feb 12:53:12 |
Details 11 Feb 12:40:11 |
We've just had a blip causing ADSL lines to drop off and reconnect. This also affected routing for about a minute.
More info to follow shortly.
|
Update
11 Feb 12:53:12
|
The issue affected both BGP links and LNS traffic, causing lines to reconnect. Not all lines and services were affected. The lines have reconnected quickly, and the problem seems to have cleared without any need for intervention from staff.
We are investigating the cause - this looks to be something external that has triggered this. We are trying to find exactly how this has happened so we can avoid the issue in future.
|
| General Users Affected |
33% |
| Closed |
11 Feb 12:53:12 |
| HEX 6/7 issues - Closed |
26 Jan 04:59:58 |
Details 26 Jan 04:56:46 |
We have lost access to everything in HEX 6/7 which means L2TP only customers are off line. This also affects some of our general servers. This is being investigated now.
|
Update
26 Jan 04:57:41
|
The issue is a result of a port/link failure earlier. We are re-configuring access via an alternative route now.
|
| Started |
26 Jan 04:31:00 |
Closed 26 Jan 04:59:58 |
Access has been restored |
| Web sites down - Closed |
09 Apr 2012 17:00:43 |
Details 09 Apr 2012 16:23:00 |
Our main web pages and customer web pages are currently inaccessible or very slow due to a denial of service attack. We are taking steps to address this. Sorry for any inconvenience.
|
Update
09 Apr 2012 16:26:34
|
This appears to be an attack on one web site, which is being moved.
|
Update
09 Apr 2012 16:31:17
|
Load is starting to move to the backup server now.
|
Update
09 Apr 2012 16:44:32
|
We have moved a lot of the load over, and adjusted TCP settings to try and ensure web sites are working, albeit slightly slowly.
|
Update
09 Apr 2012 16:59:12
|
Web access is much more responsive now.
|
| Started |
09 Apr 2012 14:36:19 |
| Closed |
09 Apr 2012 17:00:43 |
| DNS Problem with aa.net.uk / aaisp.net.uk - Closed |
14 Feb 2012 13:06:19 |
Details 13 Feb 2012 12:37:17 |
We currently have a DNS problem with our main domains aa.net.uk and aaisp.net.uk - this will cause a problem with various services that we run.
|
Update
13 Feb 2012 12:45:36
|
This was a simple planned change - it seems however we have been caught out by BIND refusing to reload if one zone has a syntax error. It seems one of our customer zones had a typo in it which our tools had not managed to check, and that caused it to refuse to load.
Unfortunately it took quite a few minutes to find what was wrong as this was nothing to do with the changes we actually made.
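A pre-reload check along the following lines is one way to catch a broken zone before it blocks a reload. This is only a minimal sketch, not the tooling actually in use: it assumes BIND's `named-checkzone` utility is on the path, and the function names and the zone-name-to-file mapping are hypothetical.

```python
import subprocess
from typing import Callable, Dict, List


def named_checkzone(zone: str, path: str) -> bool:
    """Return True if BIND's named-checkzone accepts the zone file.

    Assumes the named-checkzone binary is installed and on PATH.
    """
    result = subprocess.run(
        ["named-checkzone", zone, path],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def failing_zones(
    zones: Dict[str, str],
    check: Callable[[str, str], bool] = named_checkzone,
) -> List[str]:
    """Return the sorted names of zones whose files fail the check.

    `zones` maps zone name -> zone file path (hypothetical layout).
    """
    return sorted(name for name, path in zones.items() if not check(name, path))


def safe_to_reload(
    zones: Dict[str, str],
    check: Callable[[str, str], bool] = named_checkzone,
) -> bool:
    """Only reload when every zone, not just the edited ones, passes."""
    return not failing_zones(zones, check)
```

The point of checking every zone rather than only the ones that were changed is that, as above, the zone that blocked the reload had nothing to do with the planned change.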
|
Update
13 Feb 2012 12:55:05
|
Most things are OK.
|
Update
13 Feb 2012 13:13:38
|
Both authoritative servers are serving the aaisp.net.uk and aa.net.uk zones correctly.
|
Update
13 Feb 2012 13:39:09
|
There are still problems with the aaisp.net.uk domain - we're working on this.
VoIP is currently affected too, this is being worked on now.
|
Update
13 Feb 2012 13:56:00
|
We are running into slightly unexpected errors as well, and working through them. Some things that we have not touched are not working, which kind of makes no sense.
|
Update
13 Feb 2012 14:43:27
|
Whilst most things are working, at least from customer lines, there was an issue which caught us out, but we were way too far into the process to sensibly back out. The top level delegation for aaisp.net.uk was going to a special DNS server, which meant when we moved everything it stopped working properly.
|
Update
13 Feb 2012 14:46:17
|
Some incoming VoIP is not working, we are still working on this.
|
Update
13 Feb 2012 15:09:53
|
Still lots of progress mopping up things. Mostly non customer affecting.
Just to clarify - you should be able to use aaisp.net.uk or aa.net.uk as they are interchangeable. However, right now, some places are having some issues seeing some of the aaisp.net.uk sub domains.
The preferred version is now aa.net.uk and we should be quoting that everywhere now.
Using aa.net.uk is also a work around for the issues some people are seeing right now, which are down to DNS caches.
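For anyone scripting the work around, a small helper like the one below could rewrite host names to the preferred domain. It is an illustrative sketch only: the function name is hypothetical, and it assumes every aaisp.net.uk name has an aa.net.uk equivalent, as stated above.

```python
def prefer_aa(hostname: str) -> str:
    """Rewrite a name under aaisp.net.uk to the same name under aa.net.uk.

    Names outside aaisp.net.uk are returned unchanged (lower-cased,
    without any trailing dot).
    """
    old, new = "aaisp.net.uk", "aa.net.uk"
    name = hostname.rstrip(".").lower()
    if name == old:
        return new
    if name.endswith("." + old):
        # Keep the sub domain labels and swap only the parent domain.
        return name[: -len(old)] + new
    return name
```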
|
Update
13 Feb 2012 16:18:42
|
At the moment we still have some services affected. These are: some incoming VoIP, and accessing our services from outside our network (i.e. customers using DNS resolvers other than ours).
|
Update
13 Feb 2012 17:38:56
|
We really think this should be sorted now - but monitoring carefully.
|
Update
13 Feb 2012 18:25:30
|
More details on http://aa.net.uk/news-2012-02-dns.html
|
Update
14 Feb 2012 13:06:34
|
We think all is OK now, so closing this incident.
|
| Started |
13 Feb 2012 12:15:49 |
| Closed |
14 Feb 2012 13:06:19 |
| Mail and web server issues - Closed |
22 Jan 2012 08:32:57 |
Details 22 Jan 2012 07:57:48 |
It looks like we have a major issue with mail and web services.
This is being investigated now.
|
Update
22 Jan 2012 08:04:01
|
This looks like an issue with the main storage array used by web and email. Being worked on now.
|
Update
22 Jan 2012 08:31:56
|
Looks like we have it working again - web pages are fine - email being a bit sluggish catching up.
|
| Started |
22 Jan 2012 06:45:00 |
Closed 22 Jan 2012 08:32:57 |
The disk server has been restarted to clear the problem. The underlying cause of the problem is being investigated. |
| Routing Blip - Closed |
11 Nov 2010 11:59:24 |
Details 11 Nov 2010 11:53:30 |
Routing blipped briefly - we're investigating.
|
Update
11 Nov 2010 12:00:39
|
Routing is back now.
Total downtime was from about 11:45 to 11:53
We're investigating the cause of this problem.
|
| Started |
11 Nov 2010 11:50:28 |
| Closed |
11 Nov 2010 11:59:24 |
| Outage in RedBus HEX - Closed |
26 Oct 2010 23:06:25 |
Details 26 Oct 2010 20:57:35 |
We are seeing problems with equipment located in our HEX datacentre, most likely power related.
This will be affecting A&A webspace, customer L2TP connections, beta tester data SIM connections and some customer equipment which is hosted there.
We have been in contact with the data centre staff, who are investigating it.
|
Update
26 Oct 2010 21:32:34
|
The problem doesn't look like power, the core routers we have there appear to be rebooting.
Power cycling has restored the service to one of the routers, so access to servers hosted there is working again now. However the second router remains offline and is still causing problems with data SIM testers.
We are working to resolve that problem too.
|
Update
26 Oct 2010 21:33:04
|
The main issues have been resolved now - it does not look like power, but we are trying to find out why two separate routers developed problems at the same time in HEX. Data SIMs will be affected still.
|
Update
26 Oct 2010 22:09:33
|
We have re-routed the data SIMs and set up backup for future use anyway.
|
| Started |
26 Oct 2010 20:38:00 |
Closed 26 Oct 2010 23:06:25 |
Both routers working now. |
| Major routing issue - Closed |
22 Sep 2010 09:30:00 |
Details 22 Sep 2010 08:50:08 |
Investigating now - if this is the same as we had at the weekend we should be able to sort it quite quickly.
|
Update
22 Sep 2010 08:58:53
|
We hope to have this sorted in a few minutes.
|
Update
22 Sep 2010 09:03:14
|
This is impacting some VoIP services but not all.
|
Update
22 Sep 2010 09:11:09
|
There will be a slight blip on broadband while we sort this.
|
Update
22 Sep 2010 09:13:04
|
This looks like some issue with routing through LINX. We may take down the route collector peering until we are happy we have identified the cause.
|
Update
22 Sep 2010 09:16:25
|
Still seeing some issues.
|
Update
22 Sep 2010 09:21:03
|
Equipment reboot worked briefly and then the problem recurred. It seems clear this is a routing issue with a peer that is causing a black hole. We do not understand exactly where or how yet and this is being addressed.
|
Update
22 Sep 2010 09:35:38
|
We have taken down LINX route server peering and things are looking a lot better - checking things now.
|
Update
22 Sep 2010 09:46:39
|
It may be worth explaining this a little. We have dual redundant equipment to allow for failures. If something fails completely, or can be turned off, then the systems re-route to use other equipment. Depending on where such issues are, this can mean no outage, a few seconds or a few minutes.
However, if there is a partial failure, such as a single black-hole route for the link to Maidenhead, then this is not an equipment failure. The other routers get that route and expect it to be valid. This can create complex problems that are hard to diagnose, and also mean we have to use various alternative means to access systems, which causes delays.
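The difference between a complete failure and a black hole can be shown with a toy model. This is a deliberately simplified sketch, not how real BGP route selection works: the router names, the alphabetical tie-break and the function names are all assumptions made for illustration.

```python
from typing import Dict, Optional


def choose_next_hop(adverts: Dict[str, bool]) -> Optional[str]:
    """Pick the first router (by name) that still advertises the route.

    `adverts` maps router name -> whether it advertises the route.
    The choice is based only on what is advertised, not on whether
    traffic sent that way is actually delivered.
    """
    candidates = sorted(r for r, advertised in adverts.items() if advertised)
    return candidates[0] if candidates else None


def delivered(adverts: Dict[str, bool], forwards: Dict[str, bool]) -> bool:
    """Return True if traffic reaches its destination via the chosen hop.

    `forwards` maps router name -> whether it really forwards traffic.
    """
    hop = choose_next_hop(adverts)
    return hop is not None and forwards[hop]


# Complete failure: router "a" is down, so it stops advertising the
# route and traffic re-routes via "b".
complete_failure = delivered({"a": False, "b": True}, {"a": False, "b": True})

# Black hole: router "a" still advertises the route but drops the
# traffic, so nothing re-routes and the traffic is lost.
black_hole = delivered({"a": True, "b": True}, {"a": False, "b": True})
```

In this model `complete_failure` is `True` and `black_hole` is `False`: the redundancy only helps when the failed path stops being advertised.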
|
| Started |
22 Sep 2010 08:37:00 |
Closed 22 Sep 2010 09:30:00 |
I would stress, just because taking down the LINX route server peering seems to have addressed the issue does not mean there is an issue with LINX. This could be something odd with our routers, or the LINX route server, or a peer via that route server feeding something odd to us as a route. We're trying to identify what has happened but for now we'll leave the route server peering shut down until we know. |
| Packet loss - Power Failure in Telehouse North - Closed |
21 Jul 2010 15:06:47 |
Details 21 Jul 2010 14:26:21 |
There is a general issue with packet loss at the moment.
It looks like it might be related to transit, and we're investigating.
|
Update
21 Jul 2010 14:42:46
|
Still investigating. Looks like a problem with peering.
Something's recovering - traffic levels are returning to normal.
|
Update
21 Jul 2010 14:43:52
|
Looks like a power failure in Telehouse North, which would have affected a lot of peering, and some transit.
|
Update
21 Jul 2010 15:07:17
|
It looks like everything is back to normal for us.
|
| Started |
21 Jul 2010 14:23:47 |
| Closed |
21 Jul 2010 15:06:47 |
| Routing via HEX broken - Closed |
27 May 2010 20:20:26 |
Details 27 May 2010 19:29:03 |
We have been trying to resolve this properly and have run into a further snag which means again we have no routing to servers in HEX. Broadband is unaffected as before, but many services like control pages, hosted servers, our web site, accounts, and so on are all off-line while we resolve this.
|
Update
27 May 2010 19:44:13
|
We are waiting for someone on-site at present.
|
Update
27 May 2010 20:20:26
|
Ok, working again.
|
| Started |
27 May 2010 19:27:00 |
Closed 27 May 2010 20:20:26 |
Trying to find the exact problem for a permanent fix. |
| Router issue in HEX - Closed |
27 May 2010 17:26:33 |
Details 27 May 2010 17:13:39 |
We have a major issue with a firewall/router in HEX at present which will affect access to web pages, control pages, accounts system, and various other systems. We hope to have this resolved ASAP.
|
| Started |
27 May 2010 17:16:37 |
Closed 27 May 2010 17:26:33 |
Routing has been reworked - some smaller issues remain. |
| Lost access to machines in HEX (including clueless) - Closed |
08 Apr 2010 08:30:52 |
Details 08 Apr 2010 01:35:23 |
We seem to have lost access to machines in Harbour Exchange Square.
This includes one of our key machines for RADIUS authentication for broadband lines (clueless) and our accounts server (priceless), customer web servers (Limitless) and a few hosted machines.
We are trying to identify the cause of the problem now.
|
Update
08 Apr 2010 01:46:48
|
We have managed to confirm it is not a power issue
|
Update
08 Apr 2010 01:56:38
|
We are getting someone to check the rack in HEX now.
The backup RADIUS and DNS servers are working as they should.
|
Update
08 Apr 2010 02:05:02
|
This appears to be an issue with our main firewall in HEX. It is firewalling rather too well all of a sudden.
|
Update
08 Apr 2010 02:12:13
|
We are just waiting on a reboot now. We believe we have found the cause of the issue though so can take some preventative measures once the reboot is complete.
|
Update
08 Apr 2010 02:37:29
|
Reboot complete - all working
|
Update
08 Apr 2010 08:30:52
|
One slight side effect - session tracking timed out as the RADIUS server could not see the LNS. This meant PPP restarts for most lines at some point during the night and a few may do so during the day (on the hour). Until then usage is not being metered for those lines.
|
| Started |
08 Apr 2010 00:59:10 |
Closed 08 Apr 2010 08:30:52 |
We will be applying an update to the firewall shortly |
|
|