Android Question Access denied when downloading a website as text.

Filippo

Expert
Licensed User
Longtime User
Hi,

I am trying to download a website with my app; it always worked until recently.
Now the website refuses access and I get this error message:
ResponseError. Reason: , Response: <HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>

You don't have permission to access "http&#58;&#47;&#47;www&#46;finanzen&#46;net&#47;suchergebnis&#46;asp&#63;" on this server.<P>
Reference&#32;&#35;18&#46;c6011002&#46;1702559562&#46;75b01d0
</BODY>
</HTML>

Here is my code:
B4X:
    Dim strUrl As String
    strUrl = "https://www.finanzen.net/suchergebnis.asp?frmAktienSucheTextfeld=DE0005313704"

    getWebSeiteAlsString(strUrl)


Sub getWebSeiteAlsString(sURL As String)
    Dim job As HttpJob
    job.Initialize("WebSeiteAlsString", Me)
    job.Download(sURL)
    ProgressDialogShow2("Bitte warten...", True)
End Sub

Sub JobDone(Job As HttpJob)
    Dim parser As JSONParser
    Dim res As String
    
'    Log("JobName = " & Job.JobName & ", Success = " & Job.Success)
    
    If Job.Success Then
        res = Job.GetString
        parser.Initialize(res)
        
        Select Job.JobName
            Case "WebSeiteAlsString"
                ...               
        End Select
    Else
        MsgboxAsync("Die Charts können nicht angezeigt werden.","Kein Internet verbindung!")
    End If
    Job.Release
    
    ProgressDialogHide
End Sub

Can this block be lifted? If yes, how?
 

aeric

Maybe you need to allow cookies or provide an API key?

By the way, why don't you use Wait For with OkHttpUtils2?
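For reference, here is a minimal sketch of the Wait For pattern with OkHttpUtils2, adapted from the sub in the original post (the log message is just a placeholder):

B4X:
Sub getWebSeiteAlsString(sURL As String)
    Dim job As HttpJob
    job.Initialize("", Me)
    job.Download(sURL)
    Wait For (job) JobDone(job As HttpJob)
    If job.Success Then
        Dim res As String = job.GetString
        'res now holds the page html
    Else
        Log("Download failed: " & job.ErrorMessage)
    End If
    job.Release
End Sub

With Wait For, the separate JobDone sub and the Select on JobName are no longer needed.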
 
Upvote 0

Sandman

It's a simple case of blocking based on some data from the client. Probably user agent, or something like that.

Doesn't work, just as you posted:
Bash:
sandman@mothership:~ curl "https://www.finanzen.net/suchergebnis.asp?frmAktienSucheTextfeld=DE0005313704"
<HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>
 
You don't have permission to access "http&#58;&#47;&#47;www&#46;finanzen&#46;net&#47;suchergebnis&#46;asp&#63;" on this server.<P>
Reference&#32;&#35;18&#46;9f034917&#46;1702562512&#46;ad377c1
</BODY>
</HTML>
sandman@mothership:~

If I get the page in Firefox and copy the actual curl request from within the browser instead, it works just fine:
Bash:
sandman@mothership:~ curl 'https://www.finanzen.net/aktien/carl_zeiss_meditec-aktie' --compressed -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' -H 'Accept-Encoding: gzip, deflate, br' -H 'DNT: 1' -H 'Sec-GPC: 1' -H 'Connection: keep-alive' -H 'Cookie: at_check=true; mbox=session#30e9195ff95d42549383dfdd023a471c#1702564179; _sp_v1_ss=1:H4sIAAAAAAAAAItWqo5RKimOUbKKxs_IAzEMamN1YpRSQcy80pwcILsErKC6lpoSSrEA-EAOLpYAAAA%3D; _sp_v1_p=505; _sp_v1_data=686534; _sp_su=false; googleanalytics_consent=active=true; fintargeting_consent=active=true; jwplayer_consent=active=true; euconsent-v2=CP2xrcAP2xrcAAGABCENAdEgAP_gAEAAACQgJFBR5DrFDGFBMHBaYJEAKYgWVFgAQEQgAAAAAQABAAGAcAQCw2AiIASABCAAAQAAgAABAAAECAEEAAAAAAAEAAAAAAAAgAAIIABAABEAAgIQAAoAAAAAEAAAAAABAAAAmAAQAALAAAQAQAAQAAAAACAAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAIAAAAAAQAAAAABBDmA_AAoACwAKgAcABAACKAE4AUAAyABoAEQAJgATwA3gBzAEQAJwAfoBKQC5gGKANwAlYBLQCdgFDgLzAX8AxkBjgDIQG6gQ5ARBAAQF_BIBYAVQA_ACGAEcAPwAigBGgCSgJEAYMBIoKAIAAUACKAE4AUABzAS0Av4BjIDHAgAUADYAPgBCAEcAJ2KAAgEcGAAQCODoDgACwAKgAcABAAEQAJgAVQAxABvAD9AIYAiABOAD8AIoAR0AkoBKQCxAFzAMUAbgBF4CRAE7AKHAXmBDkCRQ4AiABcAGQANAAngCEAEcAP0AhABEQCLAEZAI4ATsBKwDBgGQgN1LQAQBHFgAIBHAwAQAEQBsgENgJaIQCgAFgBMACqAGIAN4AjgCKAEpAMUBIogAFAIyARwAsQBcwGeEoB4ACwAOABEACYAFUAMUAhgCIAEcAPwAuYBigEXgJEAXmBIokAGAAuAIQAjIBHAErAM8KQFwAFgAVAA4ACAAIgATAAqgBiAD9AIYAiAB-AEdAJKASkAuYBuAEXgJEATsAocBeYEOQJFFAB4ACgALgAyABoAE8AQgAjgBOAD9AIsARwAsQBigGeAN1AA.YAAAAAAAAAAA; consentUUID=5af7ba19-e98a-4421-8f38-b8fd3e6fe64a_26; gpt_ppid50=eM3MpclsV8LrAB4NVhn1NJcQTtUICaaHjLMCEp7p0nUeXR53bs' -H 'Upgrade-Insecure-Requests: 1' -H 'Sec-Fetch-Dest: document' -H 'Sec-Fetch-Mode: navigate' -H 'Sec-Fetch-Site: none' -H 'Sec-Fetch-User: ?1' -H 'Pragma: no-cache' -H 'Cache-Control: no-cache'
...the page html removed here...
sandman@mothership:~

The next step for you would be to start stripping down the curl command to see how much you can remove before getting an error. When you've reached the bare minimum, you'll know what to impersonate in your B4X code.
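For example, the trimming could look like this (each step is re-run to check whether the page still comes back; the minimal command shown here is the one that turned out to be sufficient):

Bash:
# drop one header at a time and re-run, keeping only what is required:
curl 'https://www.finanzen.net/aktien/carl_zeiss_meditec-aktie' --compressed \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
# ...until the bare minimum still returns the page:
curl 'https://www.finanzen.net/aktien/carl_zeiss_meditec-aktie' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0'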
 
Upvote 1

DonManfred

Can this block be lifted?
Contact the website author/admin and ask them to unblock you.

Maybe try adding a customized header to "simulate" a request from a Firefox browser:

B4X:
Dim j As HttpJob
j.Initialize("job name", Me)
j.Download(<link>) 'it can also be PostString or any of the other methods
j.GetRequest.SetHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0")
 
Upvote 2

Sandman

Quick follow-up. Just as I expected, you only need to set a user agent that they accept. Here's the one from my example above; that's all that's needed to get the HTML.
Bash:
curl 'https://www.finanzen.net/aktien/carl_zeiss_meditec-aktie' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0'
 
Upvote 0

Filippo

Contact the website author/admin and ask them to unblock you.

Maybe try adding a customized header to "simulate" a request from a Firefox browser.
Thank you! It works perfectly.

@Sandman
Many thanks!
 
Upvote 0

Filippo

Contact the website author/admin and ask them to unblock you.

Maybe try adding a customized header to "simulate" a request from a Firefox browser.
Too bad, it worked until a few days ago.

Is there perhaps another possibility or a trick?
 
Upvote 0

DonManfred

Too bad, it worked until a few days ago.
They probably changed the system behind it, adding a new security layer or something.
How many requests are you making per minute, hour, or day? Maybe you are doing it too often, and they are blocking you for it.

Contact finanzen.net and ask for an API which you can then use.
 
Upvote 0

peacemaker

They have probably tracked your requests: frequent ones, or long-running ones from a fixed IP address. It's a standard defence of websites against scraper apps.
Try changing the user agent after each batch of requests.
Proxy servers exist for such tasks.
 
Upvote 0

Sandman

Is there perhaps another possibility or a trick?
Shouldn't be needed, this continues to work just fine:
Quick follow-up. Just as I expected, you just need to set a user-agent that they can accept. Here's the one from my example above. That's all that's needed to get the html.
Bash:
curl 'https://www.finanzen.net/aktien/carl_zeiss_meditec-aktie' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0'

Either you're not mimicking the curl request well enough, or you're hammering their server so much that they blocked you.
 
Upvote 0

Filippo

curl 'https://www.finanzen.net/aktien/carl_zeiss_meditec-aktie' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0'
Perfect! Now it works again, thank you!

Either you're not mimicking the curl request well enough, or you're hammering their server so much that they blocked you.
The requests are only sent by a single app (my private app), maybe 1-2 times per week. That should not be the problem.

Contact finanzen.net and ask for an Api which you then can use.
I know the site has an API, but it's not free, and for the small number of requests I send, it's not worth it.
 
Upvote 0

Filippo

Hi guys,

It worked for exactly one year; now I get the same error message again:

ResponseError. Reason: , Response: <HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>

You don't have permission to access "http&#58;&#47;&#47;www&#46;finanzen&#46;net&#47;suchergebnis&#46;asp&#63;" on this server.<P>
Reference&#32;&#35;18&#46;c6011002&#46;1702559562&#46;75b01d0
</BODY>
</HTML>

What should I change now to make it work again?
Does anyone have any more tips?

Many thanks in advance
 
Upvote 0

peacemaker

Don't you change the user agent dynamically?
 
Upvote 0

Sandman

I had a look. They've simply blocked that specific user agent; nothing much to worry about. You had all the instructions you needed in #4 to solve this, but I might as well post one solution for you.

This works fine.
Bash:
curl 'https://www.finanzen.net/aktien/carl_zeiss_meditec-aktie' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0'

Meaning, for your code, try this as the user agent:
B4X:
Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0

One way to make your code more future-proof is to take a number of user agents from a site like this and randomize between them for each call.
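That randomization could be sketched in B4X like this (the agent strings are examples to be replaced with current ones; RandomUserAgent is a hypothetical helper name):

B4X:
'Hypothetical helper: pick a random User-Agent string per request
Sub RandomUserAgent As String
    Dim agents As List
    agents.Initialize2(Array As String( _
        "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0", _
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:120.0) Gecko/20100101 Firefox/120.0", _
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:120.0) Gecko/20100101 Firefox/120.0"))
    Return agents.Get(Rnd(0, agents.Size))
End Sub

Then set it before each download: job.GetRequest.SetHeader("User-Agent", RandomUserAgent)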
 
Upvote 0

Filippo

Hi @Sandman ,

Thank you very much for your answer.

After trying all possible user agents, I think the website is blocking everything; I always get the same answer:
*** Receiver (httputils2service) Receive (first time) ***
ResponseError. Reason: , Response: <HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>

You don't have permission to access "http&#58;&#47;&#47;www&#46;finanzen&#46;net&#47;suchergebnis&#46;asp&#63;" on this server.<P>
Reference&#32;&#35;18&#46;9c41402&#46;1747645328&#46;f6fea6a
<P>https&#58;&#47;&#47;errors&#46;edgesuite&#46;net&#47;18&#46;9c41402&#46;1747645328&#46;f6fea6a</P>
</BODY>
</HTML>
 
Upvote 0

Sandman

In that case I would suggest this:

1. Try to run the code from another public IP. Just to make sure they haven't blocked you based on your IP.

...and if that works...

2. Figure out a way that's simple for you to inspect your network traffic. (The curl command in #16 still works perfectly for me.)


In any case, it wouldn't be a bad idea for you to install curl on your machine so you can try it out yourself.
 
Upvote 0

Sandman

Have you tried this with my code from post #1?
Sure, I do it all the time when I use the emulator and communicate with my own server API. It's a great way to see the requests and responses, and it's really simple, too: just install mitmproxy, use the proxy settings in the emulator, and you can easily follow all the chatter.


However, if this is something you're not used to, I'd recommend first installing curl and also trying another IP before doing the proxy thing.
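For the record, a rough sketch of that setup on the host machine (the port and AVD name are placeholders):

Bash:
# start mitmproxy on the host (it listens on port 8080 by default)
mitmproxy --listen-port 8080
# in another terminal, launch the emulator routed through the proxy
emulator -avd MyAvd -http-proxy http://localhost:8080

(To read HTTPS traffic you also need to install the mitmproxy CA certificate inside the emulator.)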
 
Upvote 0