Notes: commonly used .htaccess settings

2021-08-01 10:49
Applies to Apache httpd


Blocking malicious sources

Blocking malicious IPs in Apache httpd (Apache 2.2 syntax; on 2.4 it needs mod_access_compat)
Order allow,deny
Allow from all
Deny from xx.yy.zz.cc 1.2.3.0/24

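# Other ways to write the IP
# 1.2.3.0/24 blocks the whole 1.2.3.x range (a /24, the size of a Class C network)
# 1.2.3.0/255.255.255.0 = 1.2.3.0/24; the two notations mean the same thing
#   (see Wikipedia)

Or, using the Apache 2.4 syntax: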

<Files *>
  <RequireAll>
    Require all granted
    Require not ip 192.168.0.150
    Require not ip 192.168.0.0/24
    Require not ip 1.2.3.0/24
  </RequireAll>
</Files>


To see which modules httpd has loaded:
apachectl -M
 
Also, if the site sits behind a CDN (such as Cloudflare), the client IP has to be read from the X-Forwarded-For header instead,
like this:
SetEnvIf X-Forwarded-For "^185\.191\.1" ip_deny
SetEnvIf X-Forwarded-For "^1\.2\.3\.4" ip_deny

Order allow,deny
Allow from all
Deny from env=ip_deny
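
For Apache 2.4 without mod_access_compat, the same idea can be sketched with Require directives. This is only a sketch: it assumes mod_setenvif and mod_authz_core are loaded, and it reuses the ip_deny variable from above. Note that "^185\.191\.1" is a prefix match, so it also covers addresses such as 185.191.171.x; the second pattern below is anchored so that it matches only 1.2.3.4, whether or not more proxy addresses follow in the header.

```apache
# Sketch: Apache 2.4 form of the X-Forwarded-For block above
SetEnvIf X-Forwarded-For "^185\.191\.1" ip_deny
# Match exactly 1.2.3.4, alone or as the first entry of a comma-separated list
SetEnvIf X-Forwarded-For "^1\.2\.3\.4(,|$)" ip_deny

<RequireAll>
  Require all granted
  Require not env ip_deny
</RequireAll>
```

X-Forwarded-For is an ordinary request header, so it is only trustworthy when every request really arrives through the CDN.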
 

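Compression

Compress files on the web server before sending them to the user. This is mainly for text-format files (html/css/js, ...).
Images and videos are already compressed, so do not compress them again here; that only wastes server CPU.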
<IfModule mod_deflate.c>
  AddOutputFilterByType DEFLATE text/plain
  AddOutputFilterByType DEFLATE text/html
  AddOutputFilterByType DEFLATE text/xml
  AddOutputFilterByType DEFLATE text/css
  AddOutputFilterByType DEFLATE application/xml
  AddOutputFilterByType DEFLATE application/xhtml+xml
  AddOutputFilterByType DEFLATE application/rss+xml
  AddOutputFilterByType DEFLATE application/javascript
  AddOutputFilterByType DEFLATE application/x-javascript
  AddOutputFilterByType DEFLATE application/x-httpd-php
  AddOutputFilterByType DEFLATE image/svg+xml
</IfModule>
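
AddOutputFilterByType accepts several media types on one line, so the list above can be written more compactly. A sketch with the same types, leaving out the legacy application/x-javascript and application/x-httpd-php entries (dropping them is my assumption that the server does not send those types):

```apache
<IfModule mod_deflate.c>
  # Several media types may follow one DEFLATE filter name
  AddOutputFilterByType DEFLATE text/plain text/html text/xml text/css
  AddOutputFilterByType DEFLATE application/xml application/xhtml+xml application/rss+xml
  AddOutputFilterByType DEFLATE application/javascript image/svg+xml
</IfModule>
```

When it works, a response to a request that sends "Accept-Encoding: gzip" carries the header "Content-Encoding: gzip".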

Setting how long files are cached on the client
<IfModule mod_expires.c>
  ExpiresActive On
  ExpiresByType image/jpg "access 1 year"
  ExpiresByType image/jpeg "access 1 year"
  ExpiresByType image/gif "access 1 year"
  ExpiresByType image/png "access 1 year"
  ExpiresByType text/css "access 1 month"
  ExpiresByType application/pdf "access 1 month"
  ExpiresByType text/x-javascript "access 1 month"
  ExpiresByType image/x-icon "access 1 year"
  ExpiresDefault "access 2 days"
</IfModule>
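
The time strings follow the pattern "<base> [plus] <number> <unit>", where the word "plus" is optional. A small sketch of the variants; the JavaScript media types are an assumption, since which one applies depends on how the server's MIME table maps .js files:

```apache
<IfModule mod_expires.c>
  ExpiresActive On
  # "access" counts from the time of the request; "plus" may be omitted
  ExpiresByType text/css "access plus 1 month"
  # Assumption: .js is served as one of these two types on this server
  ExpiresByType application/javascript "access plus 1 month"
  ExpiresByType text/javascript "access plus 1 month"
  # "modification" counts from the file's last-modified time instead
  ExpiresByType text/html "modification plus 2 hours"
</IfModule>
```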


When a visitor browses over http:, redirect automatically to https:
<IfModule mod_rewrite.c>
  RewriteEngine on
  RewriteBase /
  RewriteCond %{SERVER_PORT} !^443$
  RewriteRule ^.*$ https://%{SERVER_NAME}%{REQUEST_URI} [L,R]
</IfModule>

 

Blocking suspicious query strings (method 1)

In other words, any URL whose query string matches one of these patterns is blocked
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{QUERY_STRING} (\"|%22).*(<|>|%3) [NC,OR]
RewriteCond %{QUERY_STRING} (javascript:).*(\;) [NC,OR]
RewriteCond %{QUERY_STRING} (<|%3C).*script.*(>|%3) [NC,OR]
RewriteCond %{QUERY_STRING} (\\|\.\./|`|='$|=%27$) [NC,OR]
RewriteCond %{QUERY_STRING} (\;|'|\"|%22).*(union|select|insert|drop|update|md5|benchmark|or|and|if) [NC,OR]
RewriteCond %{QUERY_STRING} (base64_encode|localhost|mosconfig) [NC,OR]
RewriteCond %{QUERY_STRING} (boot\.ini|etc/passwd) [NC,OR]
RewriteCond %{QUERY_STRING} (GLOBALS|REQUEST)(=|\[|%) [NC]
RewriteRule .* - [F]
</IfModule>

NC: nocase, the match ignores upper/lower case
OR: logical "or" with the next RewriteCond
F: forbidden, the request is rejected with 403
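
Without [OR], consecutive RewriteCond lines are combined with AND. A minimal sketch of the difference; the /admin path and the debug=1 parameter are made-up examples:

```apache
<IfModule mod_rewrite.c>
  RewriteEngine On

  # No [OR]: both conditions must match (implicit AND)
  RewriteCond %{REQUEST_URI} ^/admin [NC]
  RewriteCond %{QUERY_STRING} debug=1 [NC]
  RewriteRule ^ - [F]

  # With [OR]: either condition is enough
  RewriteCond %{QUERY_STRING} etc/passwd [NC,OR]
  RewriteCond %{QUERY_STRING} boot\.ini [NC]
  RewriteRule ^ - [F]
</IfModule>
```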

Some of the variables available in RewriteCond:
%{REQUEST_URI}
%{QUERY_STRING}
%{HTTP_HOST}
%{HTTP_COOKIE}
%{HTTPS}
%{HTTP_USER_AGENT}
%{REQUEST_FILENAME}
https://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewritecond
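
As an illustration of two of these variables, here is a sketch that rejects requests without a User-Agent header and redirects a bare domain to its www host. example.com is a placeholder, not a name from these notes:

```apache
<IfModule mod_rewrite.c>
  RewriteEngine On

  # Reject requests that send an empty or missing User-Agent
  RewriteCond %{HTTP_USER_AGENT} ^$
  RewriteRule ^ - [F]

  # Redirect the bare domain to the www host (placeholder names)
  RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
  RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L]
</IfModule>
```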

Blocking suspicious query strings (method 2)

Return 403 directly. Note that RedirectMatch is matched against the URL path, not the query string.
<IfModule mod_alias.c>
 RedirectMatch 403 \%27 
 RedirectMatch 403 (?i)/\&(t|title)=
 RedirectMatch 403 (?i)/\.(bash|git|hg|log|svn|swp|tar)
 RedirectMatch 403 (?i)/(1|contact|i|index1|iprober|phpspy|product|signup|t|test|timthumb|tz|visit|webshell|wp-signup)\.php
 RedirectMatch 403 (?i)/(author-panel|class|database|manage|phpMyAdmin|register|submit-articles|system|usage|webmaster)/?$
 RedirectMatch 403 (?i)/(=|_mm|cgi|cvs|dbscripts|jsp|rnd|shadow|userfiles)
 RedirectMatch 403 /\$\&
</IfModule>
 

Blocking unwanted crawlers and bots

==> Any request whose User-Agent contains one of the keywords below is blocked
<IfModule mod_setenvif.c>
# Case-insensitive match
SetEnvIfNoCase User-Agent (WBSearchBot|ias_crawler|SemrushBot|CCBot|AhrefsBot|ltx71) bad_bot
SetEnvIfNoCase User-Agent (SurdotlyBot|BUbiNG|MegaIndex|Exabot|ntiny|NativeHost) bad_bot
SetEnvIfNoCase User-Agent (GSLFbot|SWEBot|Slurp|Baidu|YoudaoBot|sogou|MLBot|TwengaBot-Discover) bad_bot
SetEnvIfNoCase User-Agent (Purebot|Sosospider|HTTrack|WebZIP|libwww|NaverBot|SURF|tele) bad_bot
SetEnvIfNoCase User-Agent (TurnitinBot|WMFSDK|NSPlayer|ZyBorg|sohu-search|Crawler|Indy) bad_bot
SetEnvIfNoCase User-Agent (LinkWalker|DTS|WebFetch|psbot|EMPAS_ROBOT|NetCarta|AmigaPort|Harvest) bad_bot
SetEnvIfNoCase User-Agent (Scooter|NaviPress|Downes|Buddy|RMA|NutchCVS|TutorGigBot|Webinator) bad_bot
SetEnvIfNoCase User-Agent (Yeti|cfetch|Holmes|PHPOpenChat|HappyFunBot|PussyCat|Shockwave) bad_bot
SetEnvIfNoCase User-Agent (click|Gaint|BSC|msnbot-media|GetRight|SurdotlyBot|Qwantify) bad_bot
SetEnvIfNoCase User-Agent (PChomebot|BLEXBot|JikeSpider|AlphaBot|qihoobot|Trend|webbot) bad_bot
SetEnvIfNoCase User-Agent (wcpan|MeroBot|Offline|Glimpse|MFHttpScan|WebCopier|User-Agent) bad_bot
SetEnvIfNoCase User-Agent (sqlmap|dotbot) bad_bot

  # Apache < 2.3
  <IfModule !mod_authz_core.c>
  Order Allow,Deny
  Allow from all
  Deny from env=bad_bot
  </IfModule>

  # Apache >= 2.3
  <IfModule mod_authz_core.c>
    <RequireAll>
      Require all granted
      Require not env bad_bot
    </RequireAll>
  </IfModule>
</IfModule>

The mod_setenvif module allows you to set internal environment variables according to whether different aspects of the request match regular expressions you specify. These environment variables can be used by other parts of the server to make decisions about actions to be taken, as well as becoming available to CGI scripts and SSI pages.
https://httpd.apache.org/docs/current/mod/mod_setenvif.html

 
 

Blocking downloads of specific file extensions

<FilesMatch "\.(bak|sql)$">
  order allow,deny
  deny from all
</FilesMatch>
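
The block above uses the Apache 2.2 directives. A sketch that works on both 2.2 and 2.4, using the same version switch as the bad_bot section:

```apache
<FilesMatch "\.(bak|sql)$">
  # Apache 2.2: mod_authz_core is not present
  <IfModule !mod_authz_core.c>
    Order allow,deny
    Deny from all
  </IfModule>
  # Apache 2.4: mod_authz_core is present
  <IfModule mod_authz_core.c>
    Require all denied
  </IfModule>
</FilesMatch>
```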
 

WordPress

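The default WordPress .htaccess:
In short:
any request for a file (or directory) that does not exist is handed to index.php.
For example, given the URL https://www.xxx.com.tw/aabbcc11,
if the site has no file or directory named "aabbcc11",
the web server runs index.php instead.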
# BEGIN WordPress
<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteBase /
  RewriteRule ^index\.php$ - [L]
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteRule . /index.php [L]
</IfModule>
# END WordPress

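L: last, stop here (no further rules are processed)
-f: true when the file exists (!-f: when it does not)
-d: true when the directory exists (!-d: when it does not)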

 

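A personal blog usually has no use for the XML-RPC service,
and plenty of crawlers probe xmlrpc.php, which only adds security risk.
It is best to deny access to this file outright: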
<files xmlrpc.php>
  order allow,deny
  deny from all
</files>
Reference: "WordPress 如何關閉 XML-RPC 服務,避免資安攻擊風險" (How to disable the XML-RPC service in WordPress to avoid attack risk), iT 邦幫忙 (ithome.com.tw)

 

Automatically redirecting the site to https (SSL/TLS)

RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
 
 

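About URL rewriting

The file really lives at http://www.abc.com/xxx/123.jpg and should be reachable as http://www.abc.com/hello/123.jpg
The file really lives at http://www.abc.com/zzz/888.jpg and should be reachable as http://www.abc.com/book/888.jpg

This can be written as: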
RewriteRule ^hello/(.*) /xxx/$1 [L]
RewriteRule ^book/(.*) /zzz/$1 [L]

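QSA: Query String Append. When the substitution contains its own query string, the original query string is appended to it instead of being discarded.
See https://httpd.apache.org/docs/2.4/rewrite/flags.html#flag_qsa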
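
A sketch of what QSA changes, based on the hello rule above; the from=hello parameter is made up for illustration:

```apache
RewriteEngine On

# Without QSA: /hello/123.jpg?v=2 is rewritten to /xxx/123.jpg?from=hello
# (the original ?v=2 is dropped because the substitution has its own query string)
# RewriteRule ^hello/(.*) /xxx/$1?from=hello [L]

# With QSA: /hello/123.jpg?v=2 is rewritten to /xxx/123.jpg?from=hello&v=2
RewriteRule ^hello/(.*) /xxx/$1?from=hello [QSA,L]
```

When the substitution has no query string of its own, as in the two rules above, the original query string is passed through unchanged and QSA is not needed.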


Or it can be set up like this:
RewriteRule ^([_0-9a-zA-Z-]+/)hello/(.*) /xxx/$2 [L]
This means:
http://www.abc.com/123/hello/123.jpg  maps to the real URL -->   http://www.abc.com/xxx/123.jpg
http://www.abc.com/abc123/hello/123.jpg also maps to the real URL -->   http://www.abc.com/xxx/123.jpg

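What this is good for:
   Working around CDN caching (especially CDNs such as AWS CloudFront when they are set to ignore the query string)
   Take an image at http://www.abc.com/xxx/123.jpg
   After it is modified, the CDN does not fetch the latest 123.jpg from the origin server to send to users
   Even adding a query string, as in http://www.abc.com/xxx/123.jpg?v=1234, does not help when the CDN ignores query strings
   By changing the folder name instead, for example dynamically to
      http://www.abc.com/abc202209_001/hello/123.jpg
      http://www.abc.com/abc202209_002/hello/123.jpg
     ::  ::
    the CDN treats each URL as a different file, so users see the latest image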

 

Blocking specific file name endings

RewriteRule \.(tar|zip)$ - [NC,F,L]
Any URL whose file name ends in .tar or .zip gets a 403 right away


Other references:
Preventing *.php files in a specific directory from being executed