CEOP show Maddie is missing on 30th April 2007
Re: CEOP show Maddie is missing on 30th April 2007
Unfortunately not. What it's saying is that it doesn't always archive a whole site. That means that there won't necessarily be a target page for some of the links that are archived on a particular day. For example WBM might archive this site today but not include the member list page. If you then retrieve this site from WBM in six months' time you will get web.archive.org/web/20150621162306/maddiemccannmystery.forumotion.co.uk. If you then click on the member list link at the top of the page it will try to load web.archive.org/web/20150621162306/maddiemccannmystery.forumotion.co.uk/memberlist but it won't be there. Instead of saying page not found it looks for a memberlist in one of the other archives from around the same time, say web.archive.org/web/20150320152648/https://maddiemccannmystery.forumotion.co.uk/memberlist.
Hongkong Phooey wrote:
Does this not answer our question, if not why not?
How did I end up on the live version of a site? or I clicked on X date, but now I am on Y date, how is that possible?
Not every date for every site archived is 100% complete. When you are surfing an incomplete archived site the Wayback Machine will grab the closest available date to the one you are in for the links that are missing. In the event that we do not have the link archived at all, the Wayback Machine will look for the link on the live web and grab it if available. Pay attention to the date code embedded in the archived url. This is the list of numbers in the middle; it translates as yyyymmddhhmmss. For example in this url http://web.archive.org/web/20000229123340/http://www.yahoo.com/ the date the site was crawled was Feb 29, 2000 at 12:33 and 40 seconds.
You can see a listing of the dates of the specific URL by replacing the date code with an asterisk (*), ie: http://web.archive.org/*/www.yoursite.com
Whatever archives we have are viewable in the Wayback Machine. Please note that there is a 6 - 14 month lag time between the date a site is crawled and the date it appears in the Wayback Machine.
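For anyone who wants to check the date codes themselves, here is a minimal sketch of the two ideas in the FAQ text: reading the 14-digit date code as yyyymmddhhmmss, and picking the closest available capture when the requested one is missing. The function names are made up for illustration, the second date code in the usage example is hypothetical, and the nearest-in-time rule is only an assumption based on the FAQ wording, not the Wayback Machine's actual code.

```python
from datetime import datetime


def parse_wayback_timestamp(stamp):
    """Parse a 14-digit Wayback date code (yyyymmddhhmmss) into a datetime."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")


def closest_capture(requested, available):
    """Return the available date code nearest in time to the requested one.

    Illustrative only: assumes 'closest available date' means smallest
    absolute time difference.
    """
    target = parse_wayback_timestamp(requested)
    return min(available, key=lambda s: abs(parse_wayback_timestamp(s) - target))


# The FAQ's own example: Feb 29, 2000 at 12:33 and 40 seconds.
print(parse_wayback_timestamp("20000229123340"))

# The memberlist example: the 20150621162306 capture is missing, so the
# nearest other capture is used. "20141201000000" is a hypothetical entry.
print(closest_capture("20150621162306", ["20150320152648", "20141201000000"]))
```

This is why clicking a link inside one archived page can silently land you on a page with a different date code in its URL.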
The issue with the CEOP home page is that the page itself has links to news articles that didn't exist on 30/04/2007
AndyB- Posts : 675
Join date : 2014-09-20
Re: CEOP show Maddie is missing on 30th April 2007
AndyB wrote:
I would expect it. The date is a constant (at least until it changes at midnight) and the machine either thinks it's 25/10/2007 or it thinks it's 30/04/2007. It's the same hardware and software that archives other sites and it doesn't seem credible that it should flip from one date to another just for the couple of microseconds that it archived the CEOP site. As for human involvement, I'm not sure what sort of thing you're thinking of but if it's something like an operator getting a parameter wrong then I would expect that same thing - lots of other sites with the same issue.
And as I said, I wouldn't expect it, I have no reason to expect it, as we don't know what caused the incorrect output. You're working on the assumption that the server clock was wrong. I'm not. I'm working on the information that there is incorrect output, and I don't have the resources to discover what caused it.
Last edited by WLBTS on Sun 21 Jun 2015, 4:39 pm; edited 1 time in total
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
I agree, although I have very much enjoyed the analysis and discussion about the possibilities despite the lack of evidence.
froggy wrote:AndyB wrote:
I think both sides of the debate accept that everything worked correctly on 30/04/2007. The issue is whether the CEOP archive really did get archived on that day or whether it was actually archived in late October and misfiled as 30/04/2007.
chirpyinsect wrote:
The WB seemed to be working correctly at 12.09.52 on 30 Apr 2007 for a capture of Youtube.
http://web.archive.org/web/20070430120952/https://www.youtube.com/
In which case there will need to be some proper evidence to that effect and not just an opinion that this is what must have happened.
AndyB- Posts : 675
Join date : 2014-09-20
Re: CEOP show Maddie is missing on 30th April 2007
I'm not assuming the server clock was wrong. For all I know the page was really archived on 30/04/2007 and there is no incorrect output other than the links to the wrong news articles. What I'm saying is that I expect the same code running on the same server to archive all sites the same way. I expect consistency in something as basic as the date. Even if the date is manipulated before being output, the only way it could be different for one site and one site only is if the data being archived interacts with the date prior to it being output but that makes no sense at all.
WLBTS wrote:AndyB wrote:
I would expect it. The date is a constant (at least until it changes at midnight) and the machine either thinks it's 25/10/2007 or it thinks it's 30/04/2007. It's the same hardware and software that archives other sites and it doesn't seem credible that it should flip from one date to another just for the couple of microseconds that it archived the CEOP site. As for human involvement, I'm not sure what sort of thing you're thinking of but if it's something like an operator getting a parameter wrong then I would expect that same thing - lots of other sites with the same issue.
And as I said, I wouldn't expect it, I have no reason to expect it, as we don't know what caused the incorrect output. You're working on the assumption that the server clock was wrong. I'm not. I'm working on the information that there is incorrect output, and I don't have the resources to discover what caused it.
AndyB- Posts : 675
Join date : 2014-09-20
Re: CEOP show Maddie is missing on 30th April 2007
AndyB wrote:
I'm not assuming the server clock was wrong. For all I know the page was really archived on 30/04/2007 and there is no incorrect output other than the links to the wrong news articles. What I'm saying is that I expect the same code running on the same server to archive all sites the same way. I expect consistency in something as basic as the date. Even if the date is manipulated before being output, the only way it could be different for one site and one site only is if the data being archived interacts with the date prior to it being output but that makes no sense at all
Aye, but we're dealing with a particular time-stamp - 30th April 2007, 11:58:03 - not an entire day. My working theory is that it is that time-stamp only that is in error, that something caused the calculation of the time-stamp to be in error. If it did happen then we don't know exactly what caused it, not until WBM can explain. It could have been a boundary case in the code that produced an incorrect time-stamp from input data. There is no reason to assume that for a prolonged period of time invalid time-stamps were being chucked out.
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
Over on CMoMM rustyjames has figured out how to do a web search:
http://web.archive.org/cdx/search/cdx?url=www.ceop.gov.uk&matchType=prefix&gzip=false&from=20070430&to=20070430
Witness for yourself the very many articles that are recorded under the 30th April time-stamp that are dated later in time.
Thanks for that rustyjames, that's good work.
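If anyone wants to try other date ranges without hand-editing the address, here is a small sketch that rebuilds the same search URL from its parts. The parameters (url, matchType, gzip, from, to) are copied from the link above; the function name is made up for illustration, and nothing here actually contacts web.archive.org.

```python
from urllib.parse import urlencode

# Endpoint taken from the search link posted in this thread.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_query(url, date_from, date_to):
    """Build a CDX prefix-search URL for captures between two yyyymmdd dates."""
    params = {
        "url": url,
        "matchType": "prefix",  # match every path under the given host
        "gzip": "false",
        "from": date_from,
        "to": date_to,
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


# Reproduces the link above for the single day 30th April 2007.
print(build_cdx_query("www.ceop.gov.uk", "20070430", "20070430"))
```

Changing the two dates gives a search over any other range, for example a whole month either side of the disputed capture.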
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
You just beat me to it. That's fairly conclusive to me - the 30/04/2007 archive now looks like it was taken sometime after 06/12/2007
AndyB- Posts : 675
Join date : 2014-09-20
Re: CEOP show Maddie is missing on 30th April 2007
WLBTS wrote:AndyB wrote:
I'm not assuming the server clock was wrong. For all I know the page was really archived on 30/04/2007 and there is no incorrect output other than the links to the wrong news articles. What I'm saying is that I expect the same code running on the same server to archive all sites the same way. I expect consistency in something as basic as the date. Even if the date is manipulated before being output, the only way it could be different for one site and one site only is if the data being archived interacts with the date prior to it being output but that makes no sense at all
Aye, but we're dealing with a particular time-stamp - 30th April 2007, 11:58:03 - not an entire day. My working theory is that it is that time-stamp only that is in error, that something caused the calculation of the time-stamp to be in error. If it did happen then we don't know exactly what caused it, not until WBM can explain. It could have been a boundary case in the code that produced an incorrect time-stamp from input data. There is no reason to assume that for a prolonged period of time invalid time-stamps were being chucked out.
Why not? Surely if there was a fault as you suggest, it could have lasted for any length of time.
froggy- Posts : 747
Join date : 2015-06-17
Re: CEOP show Maddie is missing on 30th April 2007
Now let's try a search across a whole month, 14th May 2007 - 14th June 2007:
http://web.archive.org/cdx/search/cdx?url=www.ceop.gov.uk&matchType=prefix&gzip=false&from=20070514&to=20070614
No articles here from the future.
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
froggy wrote:WLBTS wrote:
Aye, but we're dealing with a particular time-stamp - 30th April 2007, 11:58:03 - not an entire day. My working theory is that it is that time-stamp only that is in error, that something caused the calculation of the time-stamp to be in error. If it did happen then we don't know exactly what caused it, not until WBM can explain. It could have been a boundary case in the code that produced an incorrect time-stamp from input data. There is no reason to assume that for a prolonged period of time invalid time-stamps were being chucked out.
Why not? Surely if there was a fault as you suggest, it could have lasted for any length of time.
Indeed. Surely 'any length of time' includes one second or less, it includes any length of time. I said that there is 'no reason to assume that for a prolonged period of time invalid time-stamps were being chucked out'.
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
WLBTS wrote:froggy wrote:WLBTS wrote:
Aye, but we're dealing with a particular time-stamp - 30th April 2007, 11:58:03 - not an entire day. My working theory is that it is that time-stamp only that is in error, that something caused the calculation of the time-stamp to be in error. If it did happen then we don't know exactly what caused it, not until WBM can explain. It could have been a boundary case in the code that produced an incorrect time-stamp from input data. There is no reason to assume that for a prolonged period of time invalid time-stamps were being chucked out.
Why not? Surely if there was a fault as you suggest, it could have lasted for any length of time.
Indeed. Surely 'any length of time' includes one second or less, it includes any length of time. I said that there is 'no reason to assume that for a prolonged period of time invalid time-stamps were being chucked out'.
But surely that is merely your opinion to suit your argument, rather than a verified fact. I find the likelihood of this happening only to the McCann file on a CEOP site unbelievable.
froggy- Posts : 747
Join date : 2015-06-17
Re: CEOP show Maddie is missing on 30th April 2007
This is getting silly now. Even if it gets an incorrect time-stamp from input data, and I can't see why it would want to do that rather than use the machine date, it still has the same input data for the duration that it's running. So unless the input data that generates the date changes for the couple of milliseconds that it's archiving the CEOP site then there are going to be many more sites that are affected in the same way.
WLBTS wrote:AndyB wrote:
I'm not assuming the server clock was wrong. For all I know the page was really archived on 30/04/2007 and there is no incorrect output other than the links to the wrong news articles. What I'm saying is that I expect the same code running on the same server to archive all sites the same way. I expect consistency in something as basic as the date. Even if the date is manipulated before being output, the only way it could be different for one site and one site only is if the data being archived interacts with the date prior to it being output but that makes no sense at all
Aye, but we're dealing with a particular time-stamp - 30th April 2007, 11:58:03 - not an entire day. My working theory is that it is that time-stamp only that is in error, that something caused the calculation of the time-stamp to be in error. If it did happen then we don't know exactly what caused it, not until WBM can explain. It could have been a boundary case in the code that produced an incorrect time-stamp from input data. There is no reason to assume that for a prolonged period of time invalid time-stamps were being chucked out.
AndyB- Posts : 675
Join date : 2014-09-20
Re: CEOP show Maddie is missing on 30th April 2007
... and just for fun, let's try the range 29th March 2007 - 29th April 2007:
http://web.archive.org/cdx/search/cdx?url=www.ceop.gov.uk&matchType=prefix&gzip=false&from=20070329&to=20070429
No articles from the future here either.
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
froggy wrote:WLBTS wrote:froggy wrote:WLBTS wrote:
Aye, but we're dealing with a particular time-stamp - 30th April 2007, 11:58:03 - not an entire day. My working theory is that it is that time-stamp only that is in error, that something caused the calculation of the time-stamp to be in error. If it did happen then we don't know exactly what caused it, not until WBM can explain. It could have been a boundary case in the code that produced an incorrect time-stamp from input data. There is no reason to assume that for a prolonged period of time invalid time-stamps were being chucked out.
Why not? Surely if there was a fault as you suggest, it could have lasted for any length of time.
Indeed. Surely 'any length of time' includes one second or less, it includes any length of time. I said that there is 'no reason to assume that for a prolonged period of time invalid time-stamps were being chucked out'.
But surely that is merely your opinion to suit your argument, rather than a verified fact. I find the likelihood of this happening only to the McCann file on a CEOP site unbelievable.
I said: 'There is no reason to assume that for a prolonged period of time invalid time-stamps were being chucked out'.
You said: 'Why not? Surely if there was a fault as you suggest, it could have lasted for any length of time.'
I answered that we can't just assume it is a prolonged period of time, it could be any amount of time, including a very small amount of time, or a very large amount of time. We can't assume what that amount of time is. Exactly as you said - 'any length of time'.
I can't see what this is about, perhaps an argument for argument's sake?
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
AndyB wrote:
This is getting silly now. Even if it gets an incorrect time-stamp from input data, and I can't see why it would want to do that rather than use the machine date, it still has the same input data for the duration that it's running. So unless the input data that generates the date changes for the couple of milliseconds that it's archiving the CEOP site then there are going to be many more sites that are affected in the same way.
Aye, perhaps you're not understanding me. I am proposing that every site that was archived under the time-stamp 30th April 2007 11:58:03 was done so in error, and that could be a large number of URLs.
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
How does Wayback determine which sites to trawl? Are sites selected randomly, or is there some pattern to ensure that all sites are covered over a certain period of time?
froggy- Posts : 747
Join date : 2015-06-17
Re: CEOP show Maddie is missing on 30th April 2007
From rustyjames' info it appears there are files dated after 30/04 being included, and some that would belong there, so surely not all of those time-stamped 30/04 are wrong?
WLBTS wrote:AndyB wrote:
This is getting silly now. Even if it gets an incorrect time-stamp from input data, and I can't see why it would want to do that rather than use the machine date, it still has the same input data for the duration that it's running. So unless the input data that generates the date changes for the couple of milliseconds that it's archiving the CEOP site then there are going to be many more sites that are affected in the same way.
Aye, perhaps you're not understanding me. I am proposing that every site that was archived under the time-stamp 30th April 2007 11:58:03 was done so in error, and that could be a large number of URLs.
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
I'm obviously not understanding you because that's the complete opposite of what I understood you to say earlier.
WLBTS wrote:AndyB wrote:
This is getting silly now. Even if it gets an incorrect time-stamp from input data, and I can't see why it would want to do that rather than use the machine date, it still has the same input data for the duration that it's running. So unless the input data that generates the date changes for the couple of milliseconds that it's archiving the CEOP site then there are going to be many more sites that are affected in the same way.
Aye, perhaps you're not understanding me. I am proposing that every site that was archived under the time-stamp 30th April 2007 11:58:03 was done so in error, and that could be a large number of URLs.
WLBTS wrote:AndyB wrote:
I'm still struggling to understand how a crawl on say 25/10/2007 can end up believing that the date is actually 30/04/2007 but, because it's a possibility, let's for the moment say that's what happened. Wouldn't you expect there to be many thousands of other sites archived the same day with similarly incorrect timestamps?
Irrespective of whether mccann.html was crawled on 30/04/2007 or not I agree that there's probably no conspiracy here. Just a panicking IT manager who really doesn't want the whole world to realise just how flaky the WBM software really is. Did you get a reply from them BTW?
On the first point, no, I wouldn't automatically expect to see that. I've seen something that to me points at a bug causing an incorrect time-stamp, which may have any cause, hardware, software, or even simple human error. I can't comment on how widespread the result of the bug would be. We could only know that if we knew what caused this particular output that I am convinced is erroneous.
AndyB- Posts : 675
Join date : 2014-09-20
Re: CEOP show Maddie is missing on 30th April 2007
From rustyjames' info it appears there are files dated after 30/04 being included, and some that would belong there, so surely not all of those time-stamped 30/04 are wrong?
Hongkong Phooey wrote:
Aye, perhaps you're not understanding me. I am proposing that every site that was archived under the time-stamp 30th April 2007 11:58:03 was done so in error, and that could be a large number of URLs.
The pages that you say 'belong there' could well have existed in April, and continued to exist until the likely December crawl that was stored under the April time-stamp. The fact that there are URLs in the April time-stamp that really shouldn't be there is the important point.
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
AndyB wrote:
I'm obviously not understanding you because that's the complete opposite of what I understood you to say earlier.
WLBTS wrote:AndyB wrote:
This is getting silly now. Even if it gets an incorrect time-stamp from input data, and I can't see why it would want to do that rather than use the machine date, it still has the same input data for the duration that it's running. So unless the input data that generates the date changes for the couple of milliseconds that it's archiving the CEOP site then there are going to be many more sites that are affected in the same way.
Aye, perhaps you're not understanding me. I am proposing that every site that was archived under the time-stamp 30th April 2007 11:58:03 was done so in error, and that could be a large number of URLs.
WLBTS wrote:AndyB wrote:
I'm still struggling to understand how a crawl on say 25/10/2007 can end up believing that the date is actually 30/04/2007 but, because it's a possibility, let's for the moment say that's what happened. Wouldn't you expect there to be many thousands of other sites archived the same day with similarly incorrect timestamps?
Irrespective of whether mccann.html was crawled on 30/04/2007 or not I agree that there's probably no conspiracy here. Just a panicking IT manager who really doesn't want the whole world to realise just how flaky the WBM software really is. Did you get a reply from them BTW?
On the first point, no, I wouldn't automatically expect to see that. I've seen something that to me points at a bug causing an incorrect time-stamp, which may have any cause, hardware, software, or even simple human error. I can't comment on how widespread the result of the bug would be. We could only know that if we knew what caused this particular output that I am convinced is erroneous.
How so? The only proposition I have made is that the time-stamp 30th April 2007 11:58:03 was erroneous. I haven't stated that there *is* a large number of URLs affected, just that there *could be*. Just as there *could be* a small number of URLs. I said that I wouldn't 'automatically expect to see' something, not that I would 'absolutely not expect to see' something. All things are possible. I've stated that I do not know how long the error was in effect, and I have not stated that it was a long time nor a short time.
I'm not sure this is an argument either of us is really interested in, is it? If you'd like to move on I'd be more than happy to oblige. Truce?
Last edited by WLBTS on Sun 21 Jun 2015, 6:37 pm; edited 1 time in total
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
On the actual core debate, I feel that the penny has finally dropped, but it may take a long time to actually hit the floor.
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
WLBTS wrote:
From rustyjames' info it appears there are files dated after 30/04 being included, and some that would belong there, so surely not all of those time-stamped 30/04 are wrong?
Hongkong Phooey wrote:
Aye, perhaps you're not understanding me. I am proposing that every site that was archived under the time-stamp 30th April 2007 11:58:03 was done so in error, and that could be a large number of URLs.
The pages that you say 'belong there' could well have existed in April, and continued to exist until the likely December crawl that was stored under the April time-stamp. The fact that there are URLs in the April time-stamp that really shouldn't be there is the important point.
The files rustyjames shows still have a filename with their proper time stamp, i.e. uk,gov,ceop)/news_items/article_20070607_ceop.htm 20070708220101 (not the full address but shows the time stamp integrated). The pages we are debating have the time stamp of 30/04, do they not? I realise that the later ones are in the earlier folder as such.
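To make the comparison concrete, here is a rough sketch that takes one line of the search output and checks whether the date in the article's filename is on or before the capture time stamp next to it. It assumes the key in the line quoted above reads uk,gov,ceop)/news_items/..., that the second field is the 14-digit capture time stamp, and that article_yyyymmdd in the filename is the article's date; the function name and the second sample line are hypothetical, made up purely to show what a mismatch would look like.

```python
import re
from datetime import datetime

# Assumption: CEOP news items carry their date in the filename as article_yyyymmdd.
ARTICLE_DATE = re.compile(r"article_(\d{8})")


def article_predates_capture(cdx_line):
    """Return True if the filename date is on or before the capture date,
    False if the article is dated after its capture, and None when the
    line has no recognisable article date or capture time stamp."""
    fields = cdx_line.split()
    if len(fields) < 2:
        return None
    urlkey, capture_stamp = fields[0], fields[1]
    match = ARTICLE_DATE.search(urlkey)
    if match is None:
        return None
    article_day = datetime.strptime(match.group(1), "%Y%m%d").date()
    capture_day = datetime.strptime(capture_stamp, "%Y%m%d%H%M%S").date()
    return article_day <= capture_day


# The line quoted above: article dated 7 June 2007, captured 8 July 2007.
print(article_predates_capture(
    "uk,gov,ceop)/news_items/article_20070607_ceop.htm 20070708220101"))

# Hypothetical line: the same article filed under the disputed April time stamp.
print(article_predates_capture(
    "uk,gov,ceop)/news_items/article_20070607_ceop.htm 20070430115803"))
```

A line where the check comes back False is what people in this thread are calling an article "from the future".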
Last edited by Hongkong Phooey on Sun 21 Jun 2015, 6:48 pm; edited 1 time in total
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
Hongkong Phooey wrote:
The files rustyjames shows still have a filename with their proper time stamp, i.e. uk,gov,ceop)/news_items/article_20070607_ceop.htm 20070708220101 (not the full address but shows the time stamp integrated). The pages we are debating have the time stamp of 30/04, do they not?
I can't tell what you're getting at HKP.
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
WLBTS wrote:Hongkong Phooey wrote:
The files rustyjames shows still have a filename with their proper time stamp, i.e. uk,gov,ceop)/news_items/article_20070607_ceop.htm 20070708220101 (not the full address but shows the time stamp integrated). The pages we are debating have the time stamp of 30/04, do they not?
I can't tell what you're getting at HKP.
Thanks for your patience!
Has rustyjames just demonstrated that, because there are files later than 30/04, the original crawl is corrupt and has had data added when relaying the page, or do you still think it is just a complete date error?
Guest- Guest
Re: CEOP show Maddie is missing on 30th April 2007
So let's go with "crawls from October are time stamped as April" for the moment (I'm still not convinced but let's go with it for the moment)
That's the whole bottom fallen out of Wayback's reputation, right there
They will be able to tell all the owners of the other sites affected, won't they?
They will be able to provide other examples
They will have to issue a major bug fix and update, because they can't risk it happening again, can they?
And they will have to make details of that bug fix public on their site so people know it's been fixed
Let's just wait and see :0
Guest- Guest