How Do We Know What is a Suspicious URL?

4 03 2011

The question was asked on the SLUniverse thread-that-never-dies as to how we can use the forthcoming media patch to tell that a URL is suspicious. How does the average user of SL know that one URL is a valid music stream and another looks like an IP data collection service? Clearly we can suck it and see – allow the URL and see if we actually get music – but by the time we realise the URL is not music, we have already been recorded in some basement guy’s database.

One suggestion was to look for & characters, and I was about to write a reply here making the point that zFire’s updated Redzone URLs use a new (equally crackable) monoalphabetic substitution cipher to obfuscate even the “&” characters, so that is not going to work. I was gearing up to write a “how to” answer to the question when I found that Psyke Phaeton had answered it fully on the above thread on SLUniverse.

I am shamelessly copying Psyke’s reply here because it would be a pity if it got less attention for being lost in that 8000+ post thread.

Psyke Phaeton said:

The main aim of the bad people is to send information about you to their server. Therefore the longer the URL the more chance it contains your information.

For example:

http://111.111.111.111:8000/music

doesn’t obviously contain any extra information that looks like it might be data about you. But

http://111.111.111.111:8000/music?id=123

becomes more suspicious. Is 123 a way of tracking you or is 123 a music selection from a larger collection? We don’t know.

http://111.111.111.111:8000/music?m=2342hdd922adattaaaa8syd7stdfssfff&x=122dgf r
or
http://111.111.111.111:8000/music/2342hdd922adattaaaa8syd7stdfssfff

become even more suspicious. What is that data on the end of these?

You aren’t looking for ? and & but for long strings of letters and symbols which might be obscuring data about you.

The longer the sequence of the URL after the first single / the more information it is potentially sending and therefore the more suspicious you should become.

If I look at the current URL for the post I am writing I see:

http://www.sluniverse.com/php/vb/newreply.php?do=newreply&p=1175291

This data on the end makes sense. I am doing a new reply and the post is number 1175291. I can therefore trust this. If I go to YouTube and play a video I see..

http://www.youtube.com/watch?v=NLmsiaN5dZM&feature=topvideos

This makes sense also. I am playing video NLmsiaN5dZM and I used the feature topvideos.

The questions you ask are: a) Is the URL suspiciously long? and b) Does the data in the URL make sense, or does it look suspiciously obscured?

Did I make sense? It’s 4am here
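Psyke’s two questions can be roughed out in a few lines of Python. This is only a sketch of the idea, not anything the viewer or the media patch actually does: the function names and the 20-character cut-off are my own assumptions, and “longest unbroken token” is just a crude stand-in for “does this data look obscured?”.

```python
import re
from urllib.parse import urlsplit


def url_payload(url):
    """Return everything after the first single '/': the path plus any query string."""
    parts = urlsplit(url)
    payload = parts.path.lstrip("/")
    if parts.query:
        payload += "?" + parts.query
    return payload


def suspicion_report(url, max_token_len=20):
    """Summarise how much data a URL carries and whether any piece looks opaque.

    max_token_len is an arbitrary, hypothetical threshold chosen for illustration.
    """
    payload = url_payload(url)
    # Break the payload at the usual separators so each value is judged on its own.
    tokens = re.split(r"[/?&=]", payload)
    longest = max(tokens, key=len)
    return {
        "payload_length": len(payload),          # question a) how long is it?
        "longest_token": longest,
        "opaque_token": len(longest) > max_token_len,  # question b) rough proxy
    }


if __name__ == "__main__":
    examples = [
        "http://111.111.111.111:8000/music",
        "http://111.111.111.111:8000/music?id=123",
        "http://111.111.111.111:8000/music/2342hdd922adattaaaa8syd7stdfssfff",
        "http://www.youtube.com/watch?v=NLmsiaN5dZM&feature=topvideos",
    ]
    for example in examples:
        print(example, suspicion_report(example))
```

Note that length on its own flags perfectly innocent URLs such as the YouTube one, which is why the second question matters: a human can see that “watch”, “feature” and “topvideos” make sense, whereas a script can only guess.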

The good news is that Linden Lab is going to allow us to see the full URLs and make informed decisions on what we see. The bad news is that there are still far too many people who will just ignore the security warnings because they don’t understand them. But we can console ourselves that never again can spyware like RedZone be completely hidden from view by the spyware operators.

7 responses

5 03 2011
Jenni Darkwatch

Figuring out what a suspicious URL looks like is going to be fairly hard programmatically. It’d need some system like Web-of-Trust or the like to do that.

Here’s why:
1. Write an LSL script with a local buffer. Each avi has a position in that buffer. Transmit buffer number+avi key to sneaky server.
2. Have a wildcard domain, i.e. *.sneaky.guy, point to the server.
3. For each avi, set the media URL to something innocuous-looking, say stream47.sneaky.guy. The number would be the buffer number of course. PHP can extract the host part easily, and correlate it with entries from the visitor list.

I obviously left out a bunch of important things, but the bottom line is: It’s practically impossible to programmatically determine what’s suspicious and what isn’t. In fact, a URL like that would probably fool even many reasonably experienced users.
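To make the point concrete, here is a minimal sketch of why the hostname trick slips past any check that only looks at what follows the first “/”. It is written in Python rather than the PHP mentioned above, the function names are hypothetical, and the http:// prefix on the example host is an assumption added so the URL parses.

```python
import re
from urllib.parse import urlsplit


def payload_after_host(url):
    """What a length-based check would inspect: the path plus any query string."""
    parts = urlsplit(url)
    payload = parts.path.lstrip("/")
    if parts.query:
        payload += "?" + parts.query
    return payload


def slot_from_host(url):
    """What a tracking server could do: read the buffer number from the first host label."""
    host = urlsplit(url).hostname or ""
    match = re.match(r"[a-z]+(\d+)\.", host)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    url = "http://stream47.sneaky.guy/"
    print("payload seen by a path check:", repr(payload_after_host(url)))
    print("buffer number the server recovers:", slot_from_host(url))
```

The path check sees nothing at all, while the server still learns which buffer slot, and so which avatar, asked for the stream.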

5 03 2011
Voff Uggla

Thanks for the info and the first reply. Really depressing though: SL is NOT fun anymore after finding out what zFire has been doing for more than a year. I don’t trust anyone or anything in SL anymore; this might be the end of SL for me. 😦

5 03 2011
Jenni Darkwatch

Don’t be depressed. If you think about it, it’s no different than RL. You have to decide who to trust, in RL as well as SL.

5 03 2011
no2redzone

Jenni, you are right. Given enough creativity, it is possible to make these things very hard to spot. We do not have to worry about zFire doing it, as he does not seem to have enough imagination or technical knowhow to do so. There is also the point that the dead giveaway for most of these systems is that they give a URL but no content, so in such cases users will immediately become aware something is up. There will be no more sneaky data collection that goes unnoticed until the database is large enough to cause serious griefing (and yes, I know the first people to notice RedZone started talking about it maybe a year or more ago).

5 03 2011
Jenni Darkwatch

I’ve thought high and low about how to stop things like that. From a technical perspective I think transparency helps, but in the end it’s up to the individual user. Pretty much the same as with webpages or really anything else on the ‘net.

I honestly think the best LL can do is to ditch the bogus “hidden” media and really just make it more transparent. Even if LL camouflages them, they’re ridiculously trivial to unmask outside of the SL client.

Just to talk out of my ass for a bit: LL has long neglected privacy in SL. Not just the privacy of its users but also the “privacy” of any creator. I know it’s impossible to stop content theft; that’s the nature of the Internet. But LL has never really given us any tools to fight griefing, harassment or stalking. Parcel bans are a joke, as are many so-called privacy settings. Even ancient IRC had better tools. If a script can reliably detect any random user’s online status, what’s the point of the privacy setting in preferences? And why is there no way to hide group membership from other group members? Why is it possible to cam into a parcel if the parcel has ban lines up? All these issues are low-hanging fruit, and from a programmer’s point of view relatively easy to fix.

People do get more aware of privacy issues these days. Whether LL will actually do anything to improve privacy is more than doubtful, IMO. Their non-existing sense of privacy is what brought us griefing, stalking and of course RZ and its ilk. All just my opinion, of course.

5 03 2011
Azure Twine

All through history communities have relied on people “doing the right thing”. We know that doesn’t work out so well, so laws, terms of service and community standards are made. In my opinion, there is only so much that can be done from a technical standpoint. The community needs to start taking some responsibility too.

Unfortunately we have no real power and Linden Lab does not respond quickly or strongly enough.

5 03 2011
annotoole

While I applaud and am grateful for the apparent change in attitude at LL, the root cause of the privacy invasion has not been addressed. Please go watch and comment on https://jira.secondlife.com/browse/VWR-9236 and try to get LL to close this exploit. Then we can look at URLs that are no longer masked and study them. The URLs being used for privacy invasion are not visible in viewer land options. You must use the media interceptor/filter to see those. If LL closed the exploit then we would still want the visible URLs and the media interceptor/filter, but the main means of stealth collection, and thus the point of the spyware systems, becomes moot.



