
Data Security and Governance in the Era of Generative AI

In an era of rapid progress in generative AI, enterprises face unique challenges in data security and governance. Our latest white paper proposes an integrated approach that combines Observability, Data Security Posture Management (DSPM), and Data Detection and Response (DDR) to address these challenges, ensure compliance, build trust, and drive innovation.

 

Introduction


Generative AI has become an engine of transformation across industries, helping enterprises automate complex tasks, surface deep insights, and produce sophisticated content. Its adoption, however, also brings significant challenges for data security and governance. These models rely on large volumes of sensitive data to operate, raising concerns about data privacy, regulatory compliance, and potential misuse.

 

Data Security and Governance Challenges in Generative AI

  1. Data scale and complexity: Generative AI depends on massive datasets that include sensitive and personal information; managing and protecting data at this volume is extremely challenging.
  2. Autonomous decision-making risk: AI systems can make decisions without human oversight, which may lead to unintended data exposure or compliance violations.
  3. Limited visibility into data flows: Tracking how data moves through AI models and understanding their decision logic is difficult, which hinders risk identification and control.
  4. Regulatory compliance pressure: Regulations such as GDPR and CCPA impose strict data protection requirements, and enterprises must ensure their AI deployments meet them.
  5. Insider threats and unauthorized access: Introducing AI systems can open new avenues for insider threats, such as employees abusing their access to sensitive data.
  6. An evolving threat landscape: Cyber attacks are growing more sophisticated, and AI systems are both potential targets and potential tools for attackers.

 

The Solution: An Integrated Strategy


To meet these challenges, enterprises need a comprehensive security strategy built on the following three pillars:


1. Observability

Observability provides insight into a system's internal workings by monitoring its outputs. In a generative AI context, this means:

  • Real-time monitoring: Continuously track AI models to spot anomalies or unexpected behavior that may signal security issues.
  • Transparency: Expose data flows and decision processes, improving compliance and simplifying audits.
  • Performance monitoring: Watch performance metrics to identify unusual changes that could point to a security weakness.


2. Data Security Posture Management (DSPM)

DSPM focuses on understanding and strengthening the security posture of enterprise data:

  • Data discovery and classification: Locate sensitive data and classify it by risk level.
  • Policy enforcement: Define and enforce rules for data access and use so that AI models operate within compliance boundaries.
  • Risk assessment: Regularly review the security posture and proactively remediate potential weaknesses.


3. Data Detection and Response (DDR)

DDR is dedicated to detecting threats to data and responding quickly:

  • Threat identification: Use advanced analytics to spot signs of data leakage or misuse in real time.
  • Incident response: Establish rapid response mechanisms to limit the impact of security incidents.
  • Remediation: Take corrective action, such as patching vulnerabilities or updating policies, to prevent recurrence.

 

How the Three Pillars Work Together


By integrating Observability, DSPM, and DDR, enterprises can build a robust security framework:

  • Mutually reinforcing components: Observability supplies the critical data that DSPM and DDR depend on.
  • Dynamic feedback: Insights from DDR inform DSPM strategy, while DSPM policies strengthen the effectiveness of observability.
  • End-to-end protection: From prevention to detection and response, every aspect of data security is covered.

 

The Core Argument


Rolling out generative AI without a sound security strategy is like building a house without foundations; the risks of data breaches, compliance failures, and reputational damage cannot be ignored. An integrated approach delivers:

  • Enforced governance: Policies are not just words on paper but are enforced through technology.
  • Risk prevention: Continuous monitoring and assessment surface and resolve problems early.
  • Compliance assurance: Regulatory requirements are met and legal exposure is avoided.
  • A foundation of trust: Customers and partners gain confidence in the enterprise's ability to manage its data.

 

Conclusion


As generative AI becomes ever more embedded in business operations, protecting and governing the data it uses is not just a technical necessity but a strategic imperative. Enterprises that adopt a strategy integrating Observability, Data Security Posture Management, and Data Detection and Response can manage these risks effectively. Doing so not only protects assets and reputation, but also unlocks the full potential of generative AI, driving innovation and competitive advantage in a secure and compliant way.

About Getvisibility

Getvisibility empowers enterprises with complete data visibility and contextual understanding across all environments. Our tailored AI solutions integrate seamlessly into your technology ecosystem, continuously identifying and prioritizing risks while proactively managing your protection surface. Getvisibility was founded on the belief that organizations should have full visibility, understanding, and control over their data. We saw a market need for a solution that helps enterprises protect sensitive information and stay compliant with data privacy regulations. Getvisibility is a trusted partner to hundreds of enterprises worldwide, helping them navigate the digital landscape with confidence and protect their most valuable asset: their data. We are a team of problem solvers committed to making a positive impact by empowering organizations to make informed decisions about their data.

About Version 2 Digital
Specialist distributor and leader in security solutions
Version 2 (Taiwan) is one of the most dynamic IT companies in Asia. With years of experience in information technology, it delivers up-to-date security solutions (such as EDR, NDR, and vulnerability management), utility products (such as remote control and web filtering), and managed detection and response (MDR) services. Through an extensive network of points of sale, resellers, and partners, it provides products that are highly acclaimed in the market along with customized, localized professional services.

Version 2 (Taiwan)'s sales coverage includes Taiwan, Hong Kong, mainland China, Singapore, and Macau, with customers across every industry, including Global 1000 multinationals, listed companies, public utilities, government departments, countless successful SMEs, and consumer-market customers from cities across Asia.

What Is Data Security Automation, and Why Is It Indispensable to Cybersecurity in 2025?

In an era of rampant cyber threats and ever-expanding data ecosystems, data security automation has become a core pillar of modern cybersecurity strategy. Heading into 2025, enterprises face unprecedented challenges: protecting sensitive information from advanced threats while meeting strict regulatory requirements with limited resources. This article looks at what data security automation is, why it matters, and how organizations can use it to build a solid line of defense.

 

What is data security automation?


Data security automation uses technology and automated processes to protect sensitive data from unauthorized access, disclosure, or misuse. With tools such as machine learning (ML), artificial intelligence (AI), and security orchestration platforms, organizations can automate repetitive security tasks, improve threat detection, and respond to data-related incidents more efficiently. Its core capabilities include:

  • Automated data discovery and classification
  • Real-time monitoring and risk assessment
  • Policy enforcement and access control
  • Automated remediation
  • Comprehensive reporting and compliance management

 

Why does data security automation matter in 2025?


Heading into 2025, three trends underline the need for data security automation:


1. Increasingly sophisticated cyber threats: Cybercriminals are using more advanced techniques, including automation, to launch attacks. Manual detection and response cannot keep pace with automated ransomware, phishing, and insider threats.


2. Explosive data growth: Global data volume is projected to reach 180 zettabytes by 2025, driven by cloud technology, IoT, and edge computing. Managing and protecting data at that scale by hand is no longer feasible.


3. Tightening regulatory requirements: Regulations such as GDPR, CCPA, and HIPAA require robust data protection. Automation reduces human error and helps maintain continuous compliance.

 

Core use cases for data security automation


1. Data classification and encryption

Automated tools classify data by sensitivity and apply the appropriate encryption. For example, machine learning can recognize personally identifiable information (PII) or financial data and protect it automatically; a loose rule-based illustration follows.
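
As a minimal sketch of the classification idea only (the patterns, labels, and sample record below are hypothetical and not any vendor's actual detection model), a rule-based classifier in Python might look like this:

import re

# Hypothetical patterns for two sensitive data types.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def classify(text):
    """Return the sensitive data types detected in a piece of text."""
    return {label for label, pattern in PATTERNS.items() if pattern.search(text)}

record = "Contact jane.doe@example.com, card 4111 1111 1111 1111"
labels = classify(record)
if labels:
    print(f"Sensitive ({', '.join(sorted(labels))}): route to encryption / DLP policy")

In practice an ML model would replace the regular expressions, but the flow is the same: detect, label, then hand off to an encryption or DLP policy.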


2. Threat detection and risk assessment

Automated systems analyze large volumes of network and endpoint data to quickly spot anomalies that signal potential breaches, improving early warning and proactive defense.


3. Data loss prevention (DLP)

Automated DLP solutions monitor data movement and enforce rules that prevent the unauthorized exfiltration of sensitive information, such as blocking confidential documents from being uploaded to unapproved cloud services.


4. Compliance monitoring

Automation continuously tracks compliance requirements and raises alerts before issues escalate into significant risk.


5. Incident response and recovery

When a data breach occurs, automation can quickly isolate affected systems, block malicious IP addresses, or roll back unauthorized changes, accelerating the response.


Benefits of automating security processes

  • Faster response: Automated systems can detect and neutralize threats in seconds, compared with hours or days for manual processes.
  • Fewer mistakes: Consistent task execution reduces the risk of human oversight and misconfiguration.
  • Cost efficiency: Automating repetitive tasks lowers operating costs and frees staff for strategic work.
  • Scalability: Automated solutions scale easily with data growth and evolving security challenges.

 

Challenges of implementing data security automation


Despite the clear benefits, organizations adopting data security automation still need to overcome several challenges:

  • Integration difficulty: Automation tools must fit smoothly into existing systems and workflows.
  • Upfront investment: The cost of advanced automation platforms can deter small and mid-sized businesses.
  • Skills gaps: Teams need training to manage and improve automated systems effectively.
  • False positives: Poorly tuned systems can generate excessive alerts and overload security teams.

 

How to roll out data security automation

  • Assess your current security posture: Identify gaps and inefficiencies in existing processes, targeting tasks that are repetitive, time-consuming, or error-prone.
  • Choose the right tools: Select platforms that fit your needs, such as SOAR (security orchestration, automation, and response), AI analytics tools, or automated DLP systems.
  • Set clear goals: Define measurable metrics, such as shorter incident response times or higher compliance rates.
  • Test and review: Introduce automation gradually, monitor the results, and keep tuning to reduce false positives and improve efficiency.
  • Train the team: Give the security team the knowledge and skills needed to get the most out of automated systems.

 

The future of data security automation


As AI and predictive analytics advance, data security automation will take further leaps forward. Trends to watch include:

  • Autonomous security systems: Fully automated frameworks that detect, analyze, and neutralize threats without human intervention.
  • Behavioral analytics: Stronger anomaly detection based on user and entity behavior.
  • Cross-platform integration: Seamless automation across hybrid cloud, on-premises, and IoT environments.

 

Conclusion


Data security automation is no longer optional; it is an essential capability for any organization that wants to thrive in 2025 and beyond. By embracing automation, enterprises can protect their data effectively and gain a competitive edge in the digital era. Success depends on choosing the right tools, overcoming implementation hurdles, and continuously improving processes.


Is your organization's data security ready for the future? Explore automation today and unlock your full cybersecurity potential.


Getvisibility DDR Use Case: The Hidden Danger of an Employee on a PIP

This is the nightmare scenario every employer wants to avoid, yet it happens more often than we think. Picture a senior project manager who has been a mainstay of the team for years, delivering results and contributing to the company's success. Then the wind changes. In recent months he seems distracted, misses deadline after deadline, ignores feedback, and puts in the bare minimum: the very picture of "quiet quitting." Every attempt to turn things around fails, and he is eventually placed on a Performance Improvement Plan (PIP).

What happens next is both predictable and dangerous. Facing an uncertain future, the employee starts planning his next career move. He updates his CV, contacts recruiters, and looks for opportunities elsewhere. But his actions do not stop there.

In his final weeks before leaving, he starts reaching into places he should not: sensitive client contracts, proprietary designs, financial data, anything he thinks could help him land the next job. He downloads files to a personal device, telling himself it is just "future insurance." No one notices. By the time he leaves, it is already too late.

 

The company is suddenly facing a triple crisis:

Regulatory risk: If the downloaded material contains personally identifiable information (PII) or financial records, a leak or mishandling could trigger heavy fines under regulations such as GDPR and HIPAA.

Collapse of trust: Word of the leak reaches customers, partners, and employees. Trust is thin ice; a single failure can shatter a reputation built over years.

Competitive disadvantage: If the former employee takes confidential material (pricing strategies, trade secrets, client proposals) to a competitor, the company may lose contracts, see its credibility damaged, and struggle to recover.

 

Why employees on a PIP are especially risky

Employees placed on a PIP are under enormous pressure. They feel cornered, undervalued, and uncertain about the future. Those emotions often drive high-risk behavior:

Anger turns into retaliation: Some employees deliberately leak sensitive data out of resentment, intending to harm the organization they feel has failed them.

Desperation drives theft: Others see confidential data as a shortcut in the job hunt, using it to gain an edge when pursuing new opportunities.

Carelessness creates exposure: Even without malicious intent, an employee building a portfolio may copy documents and unintentionally open the door to a data leak.

Worse still, without proper monitoring these risks stay invisible until they erupt into a costly crisis.

 

How Getvisibility DDR changes the outcome

With Getvisibility DDR in place, the story plays out very differently. Rewind the scenario: the moment the project manager starts accessing documents outside his remit, Getvisibility DDR picks up the anomaly. Its AI-driven intelligence flags the unusual activity and prompts the company to act quickly. Here is how Getvisibility DDR helps defend against insider threats:

Early warning on anomalous behavior

Getvisibility DDR monitors data interactions around the clock across cloud, on-premises, and hybrid environments. When an employee accesses HR files, financial data, or sensitive customer documents, the system immediately recognizes the deviation from their normal day-to-day behavior (a greatly simplified sketch of the idea follows).
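
The following Python sketch illustrates baseline-deviation detection in its simplest possible form; the user roles, file categories, and alerting action are hypothetical and are not how Getvisibility DDR works internally:

# Hypothetical per-user baseline: file categories each user normally touches.
baseline = {"project_manager": {"project_plans", "status_reports"}}

# Hypothetical access events observed today: (user, file category).
events = [
    ("project_manager", "status_reports"),
    ("project_manager", "client_contracts"),
    ("project_manager", "financial_data"),
]

for user, category in events:
    if category not in baseline.get(user, set()):
        # Deviation from the learned baseline: restrict access and notify the SOC.
        print(f"ALERT: {user} accessed '{category}' outside their normal baseline")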

Real-time containment

DDR does not wait for human intervention; it acts immediately:

  • Restricting the employee's access to sensitive documents.
  • Blocking downloads to prevent data exfiltration.
  • Notifying the SOC team and the document owners to open an investigation.

Deep context for decision-making

DDR does more than raise an alert; it provides detailed context. The SOC team can see which files were accessed, when, and how often, and compare that against the employee's historical behavior, enabling fast and accurate judgment calls.

Coordinated protection across departments

The system also notifies the relevant document owners and triggers a cross-departmental review. If the behavior is confirmed to be deliberate, access is permanently revoked and the incident is handed to the appropriate teams.

 

The outcome: business intact, data secure

With Getvisibility DDR in place, the result is:

  • No regulatory fallout: Sensitive data stays inside the organization, avoiding fines and regulatory pressure.
  • Information stays confidential: Proprietary material does not leak, and competitors gain no advantage.
  • Trust remains solid: Customers, employees, and stakeholders never have to question the company's ability to protect its data.

Instead of scrambling to contain the aftermath of a leak, the company can focus on improving processes, building trust, and planning for the future.

 

Insider threats: the numbers speak for themselves

This is not an edge case; it is a widespread problem. Consider the figures:

  • 25% of insider threats come from malicious insiders (employees or contractors) who abuse their access for personal gain.
  • 59% of departing or terminated employees admit to taking confidential or sensitive information with them.
  • 31% of data breaches in 2023 stemmed from insider threats, showing that the danger from within is as serious as any external attacker (IBM, 2023).

 

Protect your data, protect your business

Getvisibility DDR gives you:

  • Precise monitoring: Real-time tracking of file activity across all environments.
  • Rapid response: Immediate containment when a threat emerges, limiting the damage.
  • Reputation protection: Demonstrating your commitment to data security to customers, employees, and regulators.

In an era where data security is paramount, outdated tools and reactive strategies are no longer enough. With Getvisibility DDR, you gain the visibility, control, and peace of mind to keep your business standing. Do not wait for a crisis to expose the gaps; protect your data now.

Book a demo today and see how Getvisibility DDR can safeguard your organization's future.


Parsing JSON: What It Is and Why It Matters

If you grew up in the 80s and 90s, you probably remember your most beloved Trapper Keeper. The colorful binder contained all the folders, dividers, and lined paper to keep your middle school and high school self as organized as possible. Parsing JSON, a lightweight data format, is the modern, IT environment version of that colorful – perhaps even Lisa Frank themed – childhood favorite.

 

Parsing JSON involves transforming structured information into a format that can be used within various programming languages. This process can range from making JSON human-readable to extracting specific data points for processing. When you know how to parse JSON, you can improve data management, application performance, and security with structured data that allows for aggregation, correlation, and analysis.

What is JSON?

JSON, or JavaScript Object Notation, is a widely-used, human-readable, and machine-readable data exchange format. JSON structures data using text, representing it through key-value pairs, arrays, and nested elements, enabling data transfers between servers and web applications that use Application Programming Interfaces (APIs).

 

JSON has become a data-serialization standard that many programming languages support, streamlining programmers’ ability to integrate and manipulate the data. Since JSON makes it easy to represent complex objects using a clear structure while maintaining readability, it is useful for maintaining clarity across nested and intricate data models.

 

Some of JSON’s key attributes include:

  • Requires minimal memory and processing power
  • Easy to read
  • Supports key-value pairs and arrays
  • Works with various programming languages
  • Offers standard format for data serialization and transmission

 

How to make JSON readable?

Making JSON data more readable enables you to understand and debug complex objects. Some ways to make JSON more readable include the following (a short sketch follows the list):

  • Pretty-Print JSON: Pretty-printing JSON formats the input string with indentation and line breaks to make hierarchical structures and relationships between object values clearer.
  • Delete Unnecessary Line Breaks: Removing redundant line breaks while converting JSON into a single-line string literal optimizes storage and ensures consistent string representation.
  • Use Tools and IDEs: Tools and extensions in development environments that auto-format JSON data can offer an isolated view to better visualize complex JSON structures.
  • Reviver Function in JavaScript: Using the parse() method applies a reviver function that modifies object values during conversion and shapes data according to specific needs.
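
As a minimal illustration of the first two points, here is a small Python sketch (the profile data is hypothetical) that pretty-prints a JSON string and then collapses it back into a compact single-line form:

import json

raw = '{"name": "Jane Doe", "skills": ["JavaScript", "Python"]}'

# Parse the string into a Python object first.
profile = json.loads(raw)

# Pretty-print: indentation and line breaks make nesting obvious.
print(json.dumps(profile, indent=2, sort_keys=True))

# Compact form: no extra whitespace, useful for storage or transmission.
print(json.dumps(profile, separators=(",", ":")))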

 

What does it mean to parse JSON?

JSON data is typically read as a string, so parsing JSON is the process of converting that string into an object so the data can be interpreted in a programming language. For example, in JSON, a person’s profile might look like this:

{ "name": "Jane Doe", "age": 30, "isDeveloper": true, "skills": ["JavaScript", "Python", "HTML", "CSS"], "projects": [ { "name": "Weather App", "completed": true }, { "name": "E-commerce Website", "completed": false } ] }

When you parse this JSON data in JavaScript, it might look like this:

Name: Jane Doe
Age: 30
Is Developer: true
Skills: JavaScript, Python, HTML, CSS
Project 1: Weather App, Completed: true
Project 2: E-commerce Website, Completed: false

 

Even though the information looks the same, it’s easier to read because you removed all of the machine-readable formatting.

Partial JSON parsing

Partial JSON parsing is especially advantageous in environments like Python, where not all fields in the data may be available or necessary. With this flexible input handling, you can ensure model fields have default values to manage missing data without causing errors.

 

For example, if you only want to know the developer’s name, skills, and completed projects, partial JSON parsing allows you to extract the information you want and focus on specific fields.
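
A minimal Python sketch of that idea, using the same hypothetical profile and falling back to defaults when a field is missing:

import json

raw = '{"name": "Jane Doe", "skills": ["JavaScript", "Python"], "projects": [{"name": "Weather App", "completed": true}]}'
profile = json.loads(raw)

# Pull out only the fields we care about, using defaults instead of raising
# a KeyError when a field is absent.
name = profile.get("name", "unknown")
skills = profile.get("skills", [])
completed = [p["name"] for p in profile.get("projects", []) if p.get("completed")]

print(name, skills, completed)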

 

Why is JSON parsing important?

Parsing JSON transforms the JSON data so that you can handle complex objects and structured data. When you parse JSON, you can serialize and deserialize data to improve data interchange, like for web applications.

 

JSON parsing enables:

  • Data Interchange: Allows for easy serialization and deserialization of data across various systems.
  • Dynamic Parsing: Streamlines integration for web-based applications because JSON is a subset of JavaScript syntax
  • Security: Reduces injection attack risks by ensuring data conforms to expected format.
  • Customization: Transforms raw data into structured, usable objects that can be programmatically manipulated, filtered, and modified according to specific needs.

 

How to parse a JSON file

Parsing a JSON file involves transforming JSON data from a textual format into a structured format that can be manipulated within a programming environment. Modern programming languages provide built-in methods or libraries for parsing JSON data so you can easily integrate and manipulate data effectively. Once parsed, JSON data can be represented as objects or arrays, allowing operations like sorting or mapping.

 

Parsing JSON in JavaScript

Most people use the JSON.parse() method to convert string-form JSON data into JavaScript objects, since it handles both simple and complex objects. Additionally, you may choose to pass a reviver function to manage custom data conversions.

 

Parsing JSON in PHP

PHP provides the json_decode function so you can translate JSON strings into arrays or objects. Additionally, PHP provides functions that validate the JSON syntax to prevent exceptions that could interrupt execution.

 

Parsing JSON in Python

Parsing JSON in Python typically means converting JSON strings into Python dictionaries with the json module. This module provides essential functions like loads() for strings and load() for file objects, which are helpful for managing JSON-formatted API data.
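
A short, self-contained sketch of those functions (the config.json file and its contents are invented purely for the example):

import json

# Parse JSON from a string (e.g., an API response body).
payload = json.loads('{"status": "ok", "count": 3}')
print(payload["count"])  # -> 3

# load() works the same way but reads from a file object.
with open("config.json", "w") as fh:   # create a small example file
    fh.write('{"retries": 5}')
with open("config.json") as fh:
    config = json.load(fh)

# dumps() serializes Python objects back into a JSON string.
print(json.dumps({"payload": payload, "config": config}))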

 

Parsing JSON in Java

Developers typically use one of the following libraries to parse JSON in Java:

  • Jackson: efficient for handling large files and comes with an extensive feature set
  • Gson: minimal configuration and setup but slower for large datasets
  • json: built-in package providing a set of classes and methods

 

JSON Logging: Best Practices

Log files often have complex, unstructured text-based formatting. When you convert them to JSON, you can store and search your logs more easily. Over time, JSON has become a standard log format because it creates a structured database that allows you to extract the fields that matter and normalize them against other logs that your environment generates. Additionally, as an application’s log data evolves, JSON’s flexibility makes it easier to add or remove fields. Since many programming languages either include structured JSON logging in their standard libraries or offer it through third-party libraries, adopting JSON logs rarely requires building a formatter from scratch.

Log from the Start

Making sure that your application generates logs from the very beginning is critical. Logs enable you to debug the application or detect security vulnerabilities. By emitting JSON logs from the start, you make testing easier and build security monitoring into the application.

Configure Dependencies

If your dependencies can also generate JSON logs, consider configuring them to do so, because the structured format makes parsing and analyzing database logs easier.

Format the Schema

Since your JSON logs should be readable and parseable, you want to keep them as compact and streamlined as possible. Some best practices include:

  • Focusing on objects that need to be read
  • Flattening structures by concatenating keys with a separator
  • Using a uniform data type in each field
  • Parsing exception stack traces into attribute hierarchies
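
To make the flattening and uniform-typing advice concrete, here is a minimal Python sketch (the event fields and separator are illustrative, not a required schema) that emits one flattened, single-line JSON record per event:

import json, time

def flatten(d, parent="", sep="."):
    """Flatten nested dicts by concatenating keys with a separator."""
    out = {}
    for key, value in d.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            out.update(flatten(value, name, sep))
        else:
            out[name] = value
    return out

event = {"user": {"id": "u-123", "role": "admin"}, "action": "login", "success": True}
record = flatten(event)
record["timestamp"] = time.time()   # keep each field a single, uniform type

print(json.dumps(record))           # one compact JSON object per log line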

Incorporate Context

JSON enables you to include information about what you’re logging for insight into an event’s immediate context. Some context that helps correlate issues across your IT environment include:

  • User identifiers
  • Session identifiers
  • Error messages

 

Graylog: Correlating and Analyzing Logs for Operations and Security

 

With Graylog’s JSON parsing functions, you can parse out useful information, like destination address, response bytes, and other data that helps monitor security incidents or answer IT questions. After extracting the data you want, you can use the Graylog Extended Log Format (GELF) to normalize and structure all log data. Graylog’s purpose-built solution provides lightning-fast search capabilities and flexible integrations that allow your team to collaborate more efficiently.

Graylog Operations provides a cost-efficient solution for IT ops so that organizations can implement robust infrastructure monitoring while staying within budget. With our solution, IT ops can analyze historical data regularly to identify potential slowdowns or system failures while creating alerts that help anticipate issues.

With Graylog’s security analytics and anomaly detection capabilities, you get the cybersecurity platform you need without the complexity that makes your team’s job harder. With our powerful, lightning-fast features and intuitive user interface, you can lower your labor costs while reducing alert fatigue and getting the answers you need – quickly.

 

About Graylog 
At Graylog, our vision is a secure digital world where organizations of all sizes can effectively guard against cyber threats. We’re committed to turning this vision into reality by providing Threat Detection & Response that sets the standard for excellence. Our cloud-native architecture delivers SIEM, API Security, and Enterprise Log Management solutions that are not just efficient and effective—whether hosted by us, on-premises, or in your cloud—but also deliver a fantastic Analyst Experience at the lowest total cost of ownership. We aim to equip security analysts with the best tools for the job, empowering every organization to stand resilient in the ever-evolving cybersecurity landscape.

About Version 2 Digital

Version 2 Digital is one of the most dynamic IT companies in Asia. The company distributes a wide range of IT products across various areas including cyber security, cloud, data protection, end points, infrastructures, system monitoring, storage, networking, business productivity and communication products.

Through an extensive network of channels, point of sales, resellers, and partnership companies, Version 2 offers quality products and services which are highly acclaimed in the market. Its customers cover a wide spectrum which include Global 1000 enterprises, regional listed companies, different vertical industries, public utilities, Government, a vast number of successful SMEs, and consumers in various Asian cities.

How I used Graylog to Fix my Internet Connection

In today’s digital age, the internet has become an integral part of our daily lives. From working remotely to streaming movies, we rely on the internet for almost everything. However, slow internet speeds can be frustrating and can significantly affect our productivity and entertainment. Despite advancements in technology, many people continue to face challenges with their internet speeds, hindering their ability to fully utilize the benefits of the internet. In this blog, we will explore how Dan McDowell, a Professional Services Engineer, decided to take matters into his own hands and gather the data over time to present to his ISP.

Speedtest-Overview

 

Over the course of a few months, I noticed slower and slower internet connectivity. Complaints from neighbors (we are all on the same ISP) led me to take some action. A few phone calls with “mixed” results were not good enough for me, so I knew what I needed: metrics!

Why Metrics?

Showing data is, without a doubt, one of the most powerful ways to prove a statement. How often do you hear one of the following when you call in for support:

  • Did you unplug it and plug it back in?
  • It’s probably an issue with your router
  • Oh, wireless must be to blame
  • Test it directly connected to your computer!
  • Nothing is wrong on our end, must be yours…

In my scenario I was able to prove without a doubt that this wasn’t a “me” problem. Using data I gathered by running this script every 30 minutes over a few weeks’ time, I was able to prove:

  • This wasn’t an issue with my router
    • There was consistent connectivity slowness at the same times every single day of the week, and outside of those times my connectivity was near the offered maximums.
  • Something was wrong on their end
    • Clearly, they were not spec’d to handle the increase in traffic when people stop working and start streaming.
    • I used their OWN speed test server for all my testing. It was only one hop away.
    • This was all the proof I needed.
  • End result?
    • I sent in a few screenshots of my dashboards, highlighting the clear spikes during peak usage periods. I received a phone call not even 10 minutes later from the ISP. They replaced our local OLT and increased the pipe to their co-lo. What a massive increase in average performance!

Ookla Speedtest has a CLI tool?!

Yup. This can be configured to use the same speedtest server (my local ISP runs one) each run, meaning results are valid and repeatable. Best of all, it can output JSON which I can convert to GELF with ease! In short, I set up a cron job to run my speed test script every 30 minutes on my Graylog server and output the results, converting the JSON message into GELF, which NetCat sends to my GELF input.

PORT 8080 must be open outbound!
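
If you prefer Python to shell, here is a rough sketch of the same JSON-to-GELF idea the script implements; the speedtest field names, the bytes-to-bits conversion, and the use of a GELF TCP input are assumptions (the actual speedtest.sh uses gron and ncat):

import json, socket, time

# Hypothetical speedtest CLI output (speedtest -f json); exact field names
# may differ between CLI versions.
result = json.loads('{"ping": {"latency": 12.3}, "download": {"bandwidth": 11875000}, "upload": {"bandwidth": 2375000}}')

gelf = {
    "version": "1.1",
    "host": "speedtest-probe",
    "short_message": "speedtest result",
    "timestamp": time.time(),
    # Custom GELF fields must be prefixed with an underscore.
    "_ping_ms": result["ping"]["latency"],
    "_download_bps": result["download"]["bandwidth"] * 8,
    "_upload_bps": result["upload"]["bandwidth"] * 8,
}

# GELF over TCP: one JSON document terminated by a null byte.
with socket.create_connection(("localhost", 12201)) as sock:
    sock.sendall(json.dumps(gelf).encode() + b"\x00")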

How can I even?

Prerequisites

1. Install netcat, speedtest and gron.

Debian/Ubuntu

curl -s https://packagecloud.io/install/repositories/ookla/speedtest-cli/script.deb.sh | sudo bash
sudo apt install speedtest gron ncat

RHEL/CentOS/Rocky/Alpine

wget https://download-ib01.fedoraproject.org/pub/fedora/linux/releases/37/Everything/x86_64/os/Packages/g/gron-0.7.1-4.fc37.x86_64.rpm
sudo dnf install gron-0.7.1-4.fc37.x86_64.rpm

curl -s https://packagecloud.io/install/repositories/ookla/speedtest-cli/script.rpm.sh | sudo bash
sudo dnf install speedtest netcat

 

2. You also need a functional Graylog instance with a GELF input running.

3. My speedtest script and Graylog content pack (contains dashboard, route rule and a stream)

  1. Grab the script
    wget https://raw.githubusercontent.com/graylog-labs/graylog-playground/main/Speed%20Test/speedtest.sh
  2. Move the script to a common location and make it executable
    mkdir /scripts
    mv speedtest.sh /scripts/
    chmod +x /scripts/speedtest.sh

Getting Started

  1. Log in to your Graylog instance
  2. Navigate to System → Content Packs
  3. Click Upload
  4. Browse to the downloaded location of the Graylog content pack and upload it to your instance
  5. Install the content pack
  6. This will install a Stream, pipeline, pipeline rule (routing to stream) and dashboard
  7. Test out the script!
    1. ssh / console to your linux system hosting Graylog/docker
    2. Manually execute the script:
      /scripts/speedtest.sh localhost 12201
      Script Details: <path to script> <ip/dns/hostname> <port>
  8. Check out the data in your Graylog
    1. Navigate to Streams → Speed Tests
    2. Useful data appears!
    3. Navigate to Dashboards → ISP Speed Test
    4. Check out the data!
  9. Manually execute the script as much as you like. More data will appear the more you run it.
Automate the Script!

This is how I got the data to convince my ISP that something was actually wrong. Set up a cron job that runs every 30 minutes, and within a few days you should see some time-related changes.

  1. ssh or console to your linux system hosting the script / Graylog
  2. Create a crontab to run the script every 30 minutes
    1. Create the crontab (this will be for the currently logged in user OR root if sudo su was used)

crontab -e

    2. Set the script to run every 30 minutes (change as you like)

*/30 * * * * /scripts/speedtest.sh localhost 12201

  3. That’s it! As long as the user the crontab was made for has permissions, the script will run every 30 minutes and the data will go to Graylog. The dashboard will continue to populate for you automatically.

Bonus Concept – Monitor Your Sites’ WAN Connection(s)

This same script could be used to monitor WAN connections at different sites. Without any extra fields, we could use the interface_externalIp or source fields provided by the speedtest cli/sending host to filter by site location, add a pipeline rule to add a field based on a lookup table, or add a single field to the speedtest GELF message (change the script slightly) to provide that in the original message, etc. Use my dashboard as the basis for a new dashboard with per-site tabs and a summary page! The possibilities are endless.

Most of all, go have fun!


Why API Discovery Is Critical to Security

For Star Trek fans, space may be the final frontier, but in security, discovering Application Programming Interfaces (APIs) could be the technology equivalent. In the iconic episode “The Trouble with Tribbles,” the legendary starship Enterprise discovers a space station that becomes overwhelmed by little fluffy, purring, rapidly reproducing creatures called “tribbles.” In a modern IT department, APIs can be viewed as the digital tribble overwhelming security teams.

 

As organizations build out their application ecosystems, the number of APIs integrated into their IT environments continues to expand. Organizations and security teams can become overwhelmed by the sheer number of these software “tribbles,” as undiscovered and unmanaged APIs create security blindspots.

 

API discovery is a critical component of any security program because every unknown or unmanaged API expands the organization’s attack surface.

 

What is API discovery?

API discovery is a manual or automated process that identifies, documents, and catalogs an organization’s APIs so that security teams can monitor application-to-application data transfers. To manage all the APIs integrated into its ecosystem, an organization needs a comprehensive inventory that includes:

  • Internal APIs: interfaces between a company’s backend information and application functionality
  • External APIs: interfaces exposed over the internet to non-organizational stakeholders, like external developers, third-party vendors, and customers

 

API discovery enables organizations to identify and manage the following:

  • Shadow (“Rogue”) APIs: unchecked or unsupervised APIs
  • Deprecated (“Zombie”) APIs: unused yet operational APIs without the necessary security updates

 

What risks do undocumented and unmanaged APIs pose?

Threat actors can exploit vulnerabilities in these shadow and deprecated APIs, especially when the development and security teams have no way to monitor and secure them.

 

Unmanaged APIs can expose sensitive data, including information about:

  • Software interface: the two endpoints sharing data
  • Technical specifications: the way the endpoints share data
  • Function calls: verbs (GET, DELETE) and nouns (Data, Access) that indicate business logic

 

Why is API discovery important?

Discovering all your organization’s APIs enhances security by incorporating them into:

  • Risk assessments: enabling API vulnerability identification, prioritization, and remediation
  • Compliance: mitigate risks arising from accidental sensitive data exposures that lead to compliance violations, fines, and penalties
  • Vendor risk management: visibility into third-party security practices by understanding the services, applications, and environments that they can impact
  • Incident response: faster detection, investigation, and response times by understanding potential entry points, impacted services, and data leak paths
  • Policy enforcement: ensuring all internal and external APIs follow the company’s security policies and best practices
  • Training and awareness: providing appropriate educational resources for developers and IT staff

 

Beyond the security use case, API discovery provides these additional benefits:

  • Faster integrations by understanding available endpoints, methods, and data formats
  • Microservice architecture management by tracking services, health status, and interdependencies
  • Enhanced product innovation and value by understanding API capabilities and limitations
  • Increased revenue by understanding API usage

 

Using automation for API discovery

While developers can manually discover APIs, the process is expensive, inefficient, and risky. Manual API discovery processes are limited because they are:

  • Time-consuming: With the average organization integrating over 9,000 known APIs, manual processes for identifying unknown or unmanaged APIs can be overwhelming, even in a smaller environment.
  • Error-prone: Discovering all APIs, including undocumented ones and those embedded in code, can lead to incomplete discovery, outdated information, or incorrect documentation.
  • Resource-intensive: Manual discovery processes require manual inventory maintenance.

 

Automated tools make API discovery more comprehensive while reducing overall costs. Automated API discovery tools provide the following benefits:

  • Efficiency: Scanners can quickly identify APIs, enabling developers to focus on more important work.
  • Accurate, comprehensive inventory: API discovery tools can identify embedded and undocumented APIs, enhancing security and documentation.
  • Cost savings: Automation takes less time to scan for updated information, reducing maintenance costs.

 

 

What to look for in an API discovery tool

While different automated tools can help you discover the APIs across your environment, you should know the capabilities that you need and what to look for.

Continuous API Discovery

Developers can deliver new builds multiple times a day, continuously changing the API landscape and risk profile. For an accurate inventory and comprehensive visibility, you should look for a solution that:

  • Scans all API traffic at runtime
  • Categorizes API calls
  • Sorts incoming traffic into domain buckets

For example, when discovering APIs by domain, the solution includes cases where:

  • Domains are missing
  • Public or Private IP addresses are used

With the ability to identify shadow and deprecated APIs, the solution should give you a way to add domains to the:

  • Monitoring list, so you can start tracking them in the system
  • Prohibited list, so that the domain is never used
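
As a rough illustration of sorting traffic into domain buckets and spotting shadow APIs (the inventory, request log, and field handling below are hypothetical and not Graylog API Security's actual data model):

from collections import Counter
from urllib.parse import urlparse

# Hypothetical known-API inventory and observed request log.
inventory = {"api.example.com", "payments.example.com"}
requests = [
    "https://api.example.com/v1/users",
    "https://legacy.example.com/v1/export",   # not in the inventory
    "https://api.example.com/v1/orders",
]

# Bucket the observed calls by domain.
buckets = Counter(urlparse(url).hostname for url in requests)

for domain, count in buckets.items():
    status = "known" if domain in inventory else "shadow? add to monitoring or prohibited list"
    print(f"{domain}: {count} call(s) [{status}]")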

 

 

Vulnerability Identification

An API discovery solution that analyzes all traffic can also identify potential security vulnerabilities. When choosing a solution, you should consider whether it contains the following capabilities:

  • Captures unfiltered API request and response detail
  • Enhances details with runtime analysis
  • Creates an accessible datastore for attack detection
  • Identifies common threats and API failures aligned to OWASP and MITRE guidance
  • Provides automated remediation tips with actionable solutions that enable teams to optimize critical metrics like Mean Time to Response (MTTR)

Risk Assessment and Scoring

Every identified API and vulnerability increases the organization’s risk. To appropriately mitigate risk arising from previously unidentified and unmanaged APIs, the solution should provide automated risk assessment and scoring. With visibility into the type of API and the high-risk areas that should be prioritized, Security and DevOps teams can focus on the most risky APIs first.

 

Graylog API Security: Continuous, Real-Time API Discovery

Graylog API Security is continuous API security, scanning all API traffic at runtime for active attacks and threats. Mapped to security and quality rules, Graylog API Security captures complete request and response details, creating a readily accessible datastore for attack detection, fast triage, and threat intelligence. With visibility inside the perimeter, organizations can detect attack traffic from valid users before it reaches their applications.

Graylog API Security captures details to immediately identify valid traffic from malicious actions, adding active API intelligence to your security stack. Think of it as a “security analyst in-a-box,” automating API security by detecting and alerting on zero-day attacks and threats. Our pre-configured signatures identify common threats and API failures and integrate with communication tools like Slack, Teams, Gchat, JIRA or via webhooks.


FERC and NERC: Cyber Security Monitoring for The Energy Sector

As cyber threats targeting critical infrastructure continue to evolve, the energy sector remains a prime target for malicious actors. Protecting the electric grid requires a strong regulatory framework and robust cybersecurity monitoring practices. In the United States, the Federal Energy Regulatory Commission (FERC) and the North American Electric Reliability Corporation (NERC) play key roles in safeguarding the power system against cyber risks.

 

Compliance with the NERC Critical Infrastructure Protection (NERC CIP) standards provides a baseline for mitigating security risk, but organizations should implement security technologies that help them streamline these processes.

Who are FERC and NERC?

The Federal Energy Regulatory Commission (FERC) is the governmental agency that oversees the power grid’s reliability. After the Energy Policy Act of 2005 granted FERC these powers, the use of smart technologies across the energy industry expanded. This led to the Energy Independence and Security Act of 2007 (EISA), which required FERC and the National Institute of Standards and Technology (NIST) to coordinate on cybersecurity reliability standards that protect the industry.

 

However, to develop these reliability standards, FERC certified the North American Electric Reliability Corporation (NERC). Currently, NERC has thirteen published and enforceable Critical Infrastructure Protection (CIP) standards plus one more awaiting approval.

What are the NERC CIP requirements?

The cybersecurity Reliability Standards are broken out across a series of documents, each detailing the different requirements and controls for compliance.

 

CIP-002: BES Cyber System Categorization

This CIP creates “bright-line” criteria for categorizing BES Cyber Systems based on the impact an outage would cause. The publication separates BES Cyber Systems into three general categories:

  • High Impact
  • Medium Impact
  • Low Impact

 

CIP-003-8: Security Management Controls

This publication, with its most recent iteration being enforceable in April 2026, requires Responsible Entities to create policies, procedures, and processes for high or medium impact BES Cyber Systems, including:

  • Cyber security awareness: training delivered every 15 calendar months
  • Physical security controls: protections for assets, locations within an asset containing low impact BES systems, and Cyber Assets
  • Electronic access controls: controls that limit inbound and outbound electronic access for assets containing low impact BES Cyber Systems
  • Cyber security incident response: identification, classification, and response to Cyber Security Incidents, including establishing roles and responsibilities for testing (every 36 months) and handling incidents, including updating the Cyber Security Incident response plan within 180 days of a reportable incident
  • Transient cyber asset and removable media malicious code risk mitigation: Plans for implementing, maintaining, and monitoring anti-virus, application allowlists, and other methods to detect malicious code
  • Vendor electronic remote access security controls: processes for remote access to mitigate risks, including ways to determine and disable remote access and detect known or suspected malicious communications from vendor remote access

 

CIP-004-7: Personnel & Training

Every Responsible Entity needs to have one or more documented processes and provide evidence to demonstrate implementation of:

  • Security awareness training
  • Personnel risk assessments prior to granting authorized electronic or unescorted physical access
  • Access management programs
  • Access revocation programs
  • Access management, including provisioning, authorizing, and terminating access

CIP-005-7: Electronic Security Perimeter(s)

To mitigate risks, Responsible Entities need controls that permit only known and controlled communications, along with documented processes and evidence of:

  • Connection to network using a routable protocol protected by an Electronic Security Perimeter (ESP)
  • Permitting and documenting the reasoning for necessary communications while denying all other communications
  • Limiting network accessibility to management Interfaces
  • Performing authentication when allowing remote access through dial-up connectivity
  • Monitoring to detect known or suspected malicious communications
  • Implementation of controls, like encryption or physical access restrictions, to protect data confidentiality and integrity
  • Remote access management capabilities, multi-factor authentication and multiple methods for determining active vendor remote access
  • Multiple methods for disabling active vendor remote access
  • One or more methods to determine authenticated vendor-initiated remote access, terminating these remote connections, and controlling ability to reconnect

 

Most of these requirements fall under the umbrella of network security monitoring, and many organizations implement dedicated network monitoring tools to support them.

 

Once organizations can define baselines for normal network traffic, they can implement detections that alert their security teams to potential incidents.

CIP-006-6: Physical Security of BES Cyber Systems

To prove management of physical access to these systems, Responsible Entities need documented processes and evidence that include:

  • Physical security plan with defined operation or procedural controls for restricting physical access
  • Controls for managing authorized unescorted access
  • Monitoring for unauthorized physical access
  • Alarms or alerts for responding to detected unauthorized access
  • Logs that must be retained for 90 days for managing entry of individuals authorized for unescorted physical access
  • Visitor control program that includes continuous escort for visitors, logging visitors, and retaining visitor logs
  • Maintenance and testing programs for the physical access control system

 

Many organizations use technologies to help manage physical security, like badges or smart alarms. By incorporating these technologies into the overarching cybersecurity monitoring, Responsible Entities can correlate activities across the physical and digital domains.

Example: Security Card Access in buildings showing entry and exit times.

 

By tracking both physical access and digital access to BES Cyber Systems, Responsible Entities can improve their overarching security posture, especially given the interconnection between physical and digital access to systems.

CIP-007-6: System Security Management

To prove that they have the technical, operational, and procedural system security management capabilities, Responsible Entities need documented processes and evidence that include:

  • System hardening: disabling or preventing unnecessary remote access, protection against physical input/output ports used for network connectivity, risk mitigation to prevent CPU or memory vulnerabilities
  • Patch management process: evaluating security patch applicability at least once every 35 calendar days and tracking, evaluating, and installing security patches
  • Malicious code prevention: methods for deterring, detecting, or preventing malicious code and mitigating the threat of detected malicious code
  • Monitoring for security events: logging security events per system capabilities, generating security event alerts, retaining security event logs, and reviewing summaries or samplings of logged security events
  • System access controls: authentication enforcement methods, identification and inventory of all known default or generic accounts, identification of people with authorized access to shared accounts, changing default passwords, technical or procedural controls for password-only authentication, including forced changes at least once every 15 calendar months, and limiting the number of unsuccessful authentication attempts or generating alerts after a threshold of unsuccessful attempts

 

Having a robust threat detection and incident response (TDIR) solution enables Responsible Entities to leverage user and entity behavior analytics (UEBA) alongside the rest of their log data so they can handle security functions like the following (a simplified sketch of brute-force detection follows the list):

  • Privileged access management (PAM)
  • Password policy compliance
  • Abnormal privilege escalation
  • Time spent accessing a resource
  • Brute force attack detection
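
The Python sketch below is a bare-bones illustration of the brute-force detection use case only; the log format, window, and threshold are hypothetical and not a Graylog or CIP-mandated rule:

from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical authentication log: (timestamp, user, success)
events = [
    (datetime(2024, 5, 1, 9, 0, 0), "operator1", False),
    (datetime(2024, 5, 1, 9, 0, 20), "operator1", False),
    (datetime(2024, 5, 1, 9, 0, 40), "operator1", False),
    (datetime(2024, 5, 1, 9, 1, 0), "operator1", False),
    (datetime(2024, 5, 1, 9, 1, 5), "operator2", True),
]

WINDOW = timedelta(minutes=5)
THRESHOLD = 3   # failed attempts tolerated per window

failures = defaultdict(list)
for ts, user, success in events:
    if success:
        continue
    # Keep only failures inside the sliding window, then add the new one.
    failures[user] = [t for t in failures[user] if ts - t <= WINDOW] + [ts]
    if len(failures[user]) > THRESHOLD:
        print(f"ALERT: possible brute force against {user} at {ts}")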

 

CIP-008-6: Incident Reporting and Response Planning

To mitigate risk to reliable operation, Responsible Entities need documented incident response plans and evidence that include:

  • Processes for identifying, classifying, and responding to security incidents
  • Roles and responsibilities for the incident response groups or individuals
  • Incident handling procedures
  • Testing incident response plan at least once every 15 calendar months
  • Retaining records for reportable and other security incidents
  • Reviewing, updating, and communicating lessons learned, changes to the plan based on lessons learned, notifying people of changes

 

Security analytics enables Responsible Entities to enhance their incident detection and response capabilities. By building detections around MITRE ATT&CK tactics, techniques, and procedures (TTPs), security teams can connect the activities occurring in their environments with real-world activities to investigate an attacker’s path faster. Further, with high-fidelity Sigma rule detections aligned to the ATT&CK framework, Responsible Entities improve their incident response capabilities.

 

In the aftermath of an incident or incident response test, organizations need to develop reports that enable them to identify lessons learned. These include highlighting:

  • Key findings
  • Actions taken
  • Impact on stakeholders
  • Incident ID
  • Incident summary that includes type, time, duration, and affected systems/data

 

To improve processes, Responsible Entities need to organize the different pieces of evidence into an incident response report that showcases the timeline of events.

 

Further, they need to capture crucial information about the incident, including:

  • Nature of threat
  • Business impact
  • Immediate actions taken
  • When/how incident occurred
  • Who/what was affected
  • Overall scope

 

CIP-009-6: Recovery Plans for BES Cyber Systems

To support continued stability, operability, and reliability, Responsible Entities need documented recovery plans with processes and evidence for:

  • Activation of recovery plan
  • Responder roles and responsibilities
  • Backup and storage of information required for recovery and verification of backups
  • Testing recovery plan at least once every 15 calendar months
  • Reviewing, updating, and communicating lessons learned, changes to the plan based on lessons learned, notifying people of changes

 

CIP-010-4: Configuration Change Management and Vulnerability Assessments

To prevent and detect unauthorized changes, Responsible Entities need documentation and evidence of configuration change management and vulnerability assessment that includes:

  • Authorization of changes that can alter behavior of one or more cybersecurity controls
  • Testing changes prior to deploying them in a production environment
  • Verifying identity and integrity of operating systems, firmware, software, or software patches prior to installation
  • Monitoring for unauthorized changes that can alter the behavior of one or more cybersecurity controls at least once every 35 calendar days, including at least one control for configurations affecting network accessibility, CPU and memory, installation, removal, or updates to operating systems, firmware, software, and cybersecurity patches, malicious code protection, security event logging or alerting, authentication methods, enabled or disabled account status
  • Engaging in vulnerability assessment at least once every 15 calendar months
  • Performing an active vulnerability assessment in a test environment and documenting the results at least once every 36 calendar months
  • Performing vulnerability assessments for new systems prior to implementation

 

CIP-011-3: Information Protection

To prevent unauthorized access, Responsible Entities need documented information protection processes and evidence of:

  • Methods for identifying, protecting, and securely handling BES Cyber System Information (BCSI)
  • Methods for preventing the unauthorized retrieval of BCSI prior to system disposal

CIP-012-1: Communications between Control Centers

To protect the confidentiality, integrity, and availability of real-time assessment and monitoring data transmitted between Control Centers, Responsible Entities need documented processes for and evidence of:

  • Risk mitigation for unauthorized disclosure and modification or loss of availability of data
  • Identification of risk mitigation methods
  • Identification of where methods are implemented
  • Assignment of responsibilities when different Responsible Entities own or operate Control Centers

 

To mitigate data exfiltration risks, Responsible Entities need to aggregate, correlate, and analyze log data across:

  • Network traffic logs
  • Antivirus logs
  • UEBA solutions

 

With visibility into abnormal data downloads, they can more effectively monitor communications between control centers.

 

CIP-013-2: Supply Chain Risk Management

To mitigate supply chain risks, Responsible Entities need documented security controls and evidence of:

  • Procurement processes for identifying and assessing security risks related to installing vendor equipment and software and switching vendors
  • Receiving notifications about vendor-identified incidents related to products or services
  • Coordinating responses to vendor-identified incidents related to products or services
  • Notifying vendors when no longer granting remote or onsite access
  • Vendor disclosure of known vulnerabilities related to products or services
  • Verifying software and patch integrity and authenticity
  • Coordination controls for vendor-initiated remote access
  • Review and obtain approval for the supply chain risk management plan

 

CIP-015-1: Internal Network Security Monitoring

While this standard is currently awaiting approval by the NERC Board of Trustees, Responsible Entities should consider preparing for publication and enforcement with documented processes and evidence of monitoring internal networks’ security, including the implementation of:

  • Network data feeds using a risk-based rationale for monitoring network activity, including connections, devices, and network communications
  • Detections for anomalous network activity
  • Evaluating anomalous network activity
  • Retaining internal network security monitoring data
  • Protecting internal network security monitoring data

 

Graylog Security: Enabling the Energy Sector to Comply with NERC CIP

Using Graylog Security, you can rapidly mature your TDIR capabilities without the complexity and cost of traditional Security Information and Event Management (SIEM) technology. Graylog Security’s Illuminate bundles include detection rulesets so that you have content, like Sigma detections, enabling you to uplevel your security alerting, incident response, and threat hunting capabilities with correlations to ATT&CK tactics, techniques, and procedures (TTPs).

By leveraging our cloud-native capabilities and out-of-the-box content, you gain immediate value from your logs. Our anomaly detection ML improves over time without manual tuning, adapting rapidly to new data sets, organizational priorities, and custom use cases so that you can automate key user and entity access monitoring.

With our intuitive user interface, you can rapidly investigate alerts. Our lightning-fast search capabilities enable you to search terabytes of data in milliseconds, reducing dwell times and shrinking investigations by hours, days, and weeks.

To learn how Graylog Security can help you implement robust threat detection and response, contact us today.

 


Security Misconfigurations: A Deep Dive

Managing configurations in a complex environment can be like playing a game of digital Jenga. Turning off one port to protect an application can undermine the service of a connected device. Writing an overly conservative firewall configuration can prevent remote workforce members from accessing an application that’s critical to getting their work done. In the business world that runs on Software-as-a-Service (SaaS) applications and the Application Programming Interfaces (APIs) that allow them to communicate, a lot of your security is based on the settings you use and the code that you write.

 

Security misconfigurations keep creeping up the OWASP Top 10 Lists for applications, APIs, and mobile devices because they are security weaknesses that can be difficult to detect until an attacker uses them against you. With insight into what security misconfigurations are and how to mitigate risk, you can create the programs and processes that help you protect your organization.

What are Security Misconfigurations?

Security misconfigurations are insecure settings, often unchanged defaults, that remain in place during and after system deployment. They can occur anywhere within the organization’s environment because they can arise from:

  • Operating systems
  • Network devices and their settings
  • Web servers
  • Databases
  • Applications

 

Organizations typically implement hardening across their environment by changing settings to limit where, how, when, and with whom technologies communicate. Some examples of security misconfigurations may include failing to:

  • Disable or uninstall unnecessary features, like ports, services, accounts, API HTTP verbs, API logging features
  • Change default passwords
  • Limit the information that error messages send to users
  • Update operating systems, software, and APIs with security patches
  • Set secure values for application servers, application frameworks, libraries, and databases
  • Use Transport Layer Security (TLS) for APIs
  • Restrict Cross-Origin resource sharing (CORS)

 

Security Misconfigurations: Why Do They Happen?

Today’s environments consist of complex, interconnected technologies. While all the different applications and devices make business easier, they make security configuration management far more challenging.

 

Typical reasons that security misconfigurations happen include:

  • Complexity: Highly interconnected systems can make identifying and implementing all possible security configurations difficult.
  • Patches: Updating software and systems can have a domino effect across all interconnected technologies that can change a configuration’s security.
  • Hardware upgrades: Adding new servers or moving to the cloud can change configurations at the hardware and software levels.
  • Troubleshooting: Fixing a network, application, or operating system issue to maintain service availability may impact other configurations.
  • Unauthorized changes: Failing to follow change management processes for adding new technologies or fixing issues can impact interconnections, like users connecting corporate email to authorize API access for an unsanctioned web application.
  • Poor documentation: Failure to document baselines and configuration changes can lead to lack of visibility across the environment.

Common Types of Security Misconfiguration Vulnerabilities

To protect your systems against cyber attacks, you should understand what some common security misconfigurations are and what they look like.

  • Improperly Configured Databases: overly permissive access rights or lack of authentication
  • Unsecured Cloud Storage: lack of encryption or weak access controls
  • Default or Weak Passwords: failure to change passwords or poor password hygiene leading to credential-based attacks
  • Misconfigured Firewalls or Network Settings: poor network segmentation, permissive firewall settings, open ports left unsecured
  • Outdated Software or Firmware: failing to install software, firmware, or API security updates or patches that fix bugs
  • Inactive Pages: failure to include noopener or noreferrer attributes in a website or web application
  • Unneeded Services/Features: leaving network services available and ports open, like web servers, file share servers, proxy servers, FTP servers, Remote Desktop Protocol (RDP), Virtual Network Computing (VNC), and Secure Shell Protocol (SSH)
  • Inadequate Access Controls: failure to implement and enforce access policies that limit user interaction, like the principle of least privilege for user access, deny-by-default for resources, or lack of API authentication and authorization
  • Unprotected Folders and Files: using predictable, guessable file names and locations that identify critical systems or data
  • Improper Error Messages: API error messages returning data such as stack traces, system information, database structure, or custom signatures

Best Practices for Preventing Security Misconfiguration Vulnerabilities

As you connect more SaaS applications and use more APIs, monitoring for security misconfigurations becomes critical to your security posture.

Implement a hardening process

Hardening is the process of choosing the configurations for your technology stack that limit unauthorized external access and use. For example, many organizations use the CIS Benchmarks that provide configuration recommendations for over twenty-five vendor product families. Organizations in the Defense Industrial Base (DIB) use the Department of Defense (DoD) Security Technical Implementation Guides (STIGs).

 

Your hardening processes should include a change management process that:

  • Sets and documents baselines
  • Identifies changes in the environment
  • Reviews whether changes are authorized
  • Allows, blocks, or rolls back changes as appropriate
  • Updates baselines and documentation to reflect allowed changes

Implement a vulnerability management and remediation program

Vulnerability scanners can identify common vulnerabilities and exposures (CVEs) on network-connected devices. Your vulnerability management and remediation program should:

  • Define critical assets: know the devices, resources, and users that impact the business the most
  • Assign ownership: identify the people responsible for managing and updating critical assets
  • Identify vulnerabilities: use penetration tests, red teaming, and automated tools, like vulnerability scanners
  • Prioritize vulnerabilities: combine a vulnerability’s severity and exploitability to determine the ones that pose the highest risk to the organization’s business operations
  • Identify and monitor key performance indicators (KPIs): set metrics to determine the program’s effectiveness, including number of assets managed, number of assets scanned per month, frequency of scans, percentage of scanned assets containing vulnerabilities, percentage of vulnerabilities fixed within 30, 60, and 90 days

 

Monitor User and Entity Activity

Security misconfigurations often lead to unauthorized access. To mitigate risk, you should implement best authentication, authorization, and access practices that include:

  • Multifactor Authentication: requiring users to provide two or more of the following: something they know (password), something they have (token/smartphone), or something they are (fingerprint or face ID)
  • Role-based access controls (RBAC): granting users the least amount of access to resources needed for their job functions
  • Activity baselines: understanding normal user and entity behavior to identify anomalous activity
  • Monitoring: identifying activity spikes like file permission changes, modifications, and deletions across email servers, webmail, removable media, and DNS

 

Implement and monitor API Security

APIs are the way that applications talk to one another, often sharing sensitive data. Many companies struggle to manage the explosion of APIs that their digital transformation strategies created, which leaves security weaknesses that attackers seek to exploit. To mitigate these risks, you should implement a holistic API security monitoring program that includes:

  • Continuously discovering APIs across the environment
  • Scanning all API traffic at runtime
  • Categorizing API calls
  • Sorting API traffic into domain buckets
  • Automatically assessing risk
  • Prioritizing remediation action using context that includes activity and intensity
  • Capturing unfiltered API request and response details

 

 

Graylog Security and Graylog API Security: Helping Detect and Remediate Security Misconfigurations

Built on the Graylog Platform, Graylog Security gives you the features and functionality of a SIEM while eliminating the complexity and reducing costs. With our easy to deploy and use solution, you get the combined power of centralized log management, data enrichment and normalization, correlation, threat detection, incident investigation, anomaly detection, and reporting.

 

Graylog API Security is continuous API security, scanning all API traffic at runtime for active attacks and threats. Mapped to security and quality rules like OWASP Top 10, Graylog API Security captures complete request and response detail, creating a readily accessible datastore for attack detection, fast triage, and threat intelligence. With visibility inside the perimeter, organizations can detect attack traffic from valid users before it reaches their applications.

 

With Graylog’s prebuilt content, you don’t have to worry about choosing the server log data you want because we do it for you. Graylog Illuminate content packs automate the visualization, management, and correlation of your log data, eliminating the manual processes for building dashboards and setting alerts.

 

About Graylog 
At Graylog, our vision is a secure digital world where organizations of all sizes can effectively guard against cyber threats. We’re committed to turning this vision into reality by providing Threat Detection & Response that sets the standard for excellence. Our cloud-native architecture delivers SIEM, API Security, and Enterprise Log Management solutions that are not just efficient and effective—whether hosted by us, on-premises, or in your cloud—but also deliver a fantastic Analyst Experience at the lowest total cost of ownership. We aim to equip security analysts with the best tools for the job, empowering every organization to stand resilient in the ever-evolving cybersecurity landscape.

About Version 2 Digital

Version 2 Digital is one of the most dynamic IT companies in Asia. The company distributes a wide range of IT products across various areas including cyber security, cloud, data protection, end points, infrastructures, system monitoring, storage, networking, business productivity and communication products.

Through an extensive network of channels, point of sales, resellers, and partnership companies, Version 2 offers quality products and services which are highly acclaimed in the market. Its customers cover a wide spectrum which include Global 1000 enterprises, regional listed companies, different vertical industries, public utilities, Government, a vast number of successful SMEs, and consumers in various Asian cities.

Graylog Parsing Rules and AI Oh My!

In the log aggregation game, the biggest difficulty you face can be setting up parsing rules for your logs. To qualify this statement: simply getting log files into Graylog is easy. Graylog also has out-of-the-box parsing for a wide variety of common log sources, so if your logs fall into one of the many categories for which there is a dedicated Input or a dedicated Illuminate component, or that use a defined Syslog format, then yes, parsing logs is also easy.

The challenge arises when you have a log source that does not neatly fall into one of these parsed out-of-the-box categories. A Graylog Raw/Plaintext input will accept just about any log format you can find, so getting the message into Graylog without parsing isn’t hard. The difficulty is usually then turning your message from a block of raw text into a useful array of fields that can be searched and aggregated.

It is difficult to provide a step-by-step process on how to parse a log message. Log messages do not obligingly follow a widely agreed-upon format. Developers often make up their own log formats on the fly, and don’t necessarily do so with a lot of thought to how easy it will be to parse later. It follows that the process of breaking log messages down into fields is usually bespoke. It is a common joke in the field that even as technology gets better, parsing data that can be given in a wide array of different formats – in particular, timestamps – remains very challenging.

Since there is no one-size-fits-all approach, and we understand that you are too good-looking and busy for an exhaustive manual on every single approach to parsing, this guide will instead try to provide useful quick examples and links to the primary methods of parsing logs. We will assume in all the subsequent examples that the text that needs parsing is in the $message.message field – when lifting Pipeline rules from this guide, remember to replace this field in the code block with the field from which you are trying to parse text.

1. Look for Delimiters

Fields that are consistently separated by a delimiter – a comma, a pipe, a space – are very easy to parse. For example, the message:
Graylog 100 awesome
Let’s say this message lists a piece of software, its review score, and a one-word review summary. The following pipeline rule will parse named fields out of the contents of $message.message (e.g. the message field), delimited by " " (a space). Changing the character within those quotation marks allows you to delimit by other characters. The fields are extracted (and so named) in the order they appear.
rule "Parse fields from message"
when
    true
then
    let pf = split(
        pattern: " ",
        value: to_string($message.message)
    );
    set_field("fieldname_1", pf[0]);
    set_field("fieldname_2", pf[1]);
    set_field("fieldname_3", pf[2]);
end
For example, if the message field is currently “Graylog 100 awesome”, this rule would create three new fields with the following values:

fieldname_1: “Graylog”
fieldname_2: “100”
fieldname_3: “awesome”

Very easy! We can also change the delimiter to be “,” or “, ” or “|” as needed by changing the value in the pattern field. Now, sometimes a message is very nearly consistently separated by a delimiter, but there are some annoying junk characters messing the parsing up. For those cases, here is an example of the same pipeline rule, but which first removes any annoying square bracket characters from the message before parsing it into space-delimited fields.
rule "Parse fields from message"
when
    true
then
    let raw = to_string($message.message);
    let cleaned = regex_replace(
        pattern: "^\\[|\\]$",
        value: raw,
        replacement: ""
    );
    let pf = split(
        pattern: " ",
        value: cleaned
    );
    set_field("fieldname_1", pf[0]);
    set_field("fieldname_2", pf[1]);
    set_field("fieldname_3", pf[2]);
end
This technique of “cleaning” values from messages before parsing can of course be copy-pasted to act before any other parsing method.

2. Look for Key Value Pairs

Messages that consist of a list of key value pairs are also very easy to parse. For example, the message:
fieldname_1=graylog fieldname_2=100 fieldname_3=awesome
Key value pairs are also the extraction method you would employ if the contents of $message.message (e.g. the message field) looked like this:
"fieldname_1"="graylog" "fieldname_2"="100" "fieldname_3"="awesome"
Or like this:
fieldname_1='graylog',fieldname_2='100',fieldname_3='awesome'
Or like this:
"fieldname_1","graylog" "fieldname_2","100" "fieldname_3","awesome"
Any consistent format that lists a field name followed by a value is a good target for this parsing approach. There is a nice Graylog Blog post that talks about Key Value Pair extraction in great detail, and documentation on using the key_value function. For the reader who is too executive to have time to read a whole blog post right now, here is a pipeline rule that would parse that last example (observe that we are trimming the " characters from both the keys and values, and that " has to be escaped as \" inside the rule):
rule "key_value_parser"
when
    true
then
    set_fields(
        fields: key_value(
            value: to_string($message.message),
            trim_value_chars: "\"",
            trim_key_chars: "\"",
            delimiters: " ",
            kv_delimiters: ","
        )
    );
end
This rule would again create three new fields with the following values:

fieldname_1: “graylog”
fieldname_2: “100”
fieldname_3: “awesome”
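
For the plainer, unquoted example at the top of this section (fieldname_1=graylog fieldname_2=100 fieldname_3=awesome), a minimal sketch like the one below is usually enough. It assumes the key_value function’s defaults (whitespace between pairs, "=" between key and value) fit your data; the rule name is ours:
rule "key_value_parser_simple"
when
    true
then
    // relies on the defaults: pairs separated by spaces, keys and values separated by "="
    set_fields(
        fields: key_value(
            value: to_string($message.message)
        )
    );
end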

3. Look for JSON Format

JSON-formatted messages are easily recognized by their structured organization of braces, brackets, and commas. JSON logs work nicely with Graylog, since the format provides not only the values but also the field names. Graylog can parse JSON-format logs very simply using JSON flattening, which is detailed in the Graylog documentation. If we take the below JSON message as an example:
{
   "type": "dsdbChange",
   "dsdbChange": {
       "version": {
           "major": 1,
           "minor": 0
       },
       "statusCode": 0,
       "status": "Success",
       "operation": "Modify",
       "remoteAddress": null,
       "performedAsSystem": false,
       "userSid": "S-1-5-18",
       "dn": "DC=DomainDnsZones,DC=XXXXX,DC=XXXX,DC=com",
       "transactionId": "XXXX-XXXX-XXXX-XXXX",
       "sessionId": "XXXX-XXXX-XXXX-XXXX",
       "attributes": {
           "repsFrom": {
               "actions": [{
                   "action": "replace",
                   "values": [{
                       "base64": true,
                       "value": "SOMELONGBASE64ENCODEDVALUE"
                   }]
               }]
           }
       }
   }
}
We can parse this effortlessly with a generic JSON parsing Pipeline Rule, below:
rule "JSON FLATTEN"
when
   true
then
   let MyJson = flatten_json(value: to_string($message.message), array_handler: "flatten", stringify: false);
   set_fields(to_map(MyJson));
end
This will parse all the fields out of the JSON structure, fire and forget.

4. Look for a consistent format for Grok

OK, so your logs don’t follow a format that Graylog can parse out-of-the-box, are not consistently delimited, are not set up in key value pairs, and are not in a JSON format. But the format is at least consistent, even if the way the fields are broken up maybe isn’t. There is a structure here that we can parse using Grok. For example, the message:

2023-02-22T09:29:22.512-04:00   XXX.XXX.XXX.XXX  <179>50696: Feb 22 13:29:22.512: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/11, changed state to down

This log format is all over the place with its delimitation of fields, but there is still a consistent pattern of fields we can see: timestamp, ip_address, priority, process_id, event_timestamp, interface_name, interface_state. In this situation, the easiest way to extract these fields is to use Grok. You can read more about using Grok within a Pipeline Rule in the Graylog documentation. Grok might look a bit intimidating, but it’s actually pretty easy once you get started. Online Grok debuggers, such as https://grokdebugger.com/, are your best friend when writing a Grok rule. The key to writing Grok is to focus on capturing one word at a time before trying to capture the next, and to remember that whitespace – including trailing whitespace, which often catches people out – is included in the pattern. Here is the Grok pattern to parse this message:

%{TIMESTAMP_ISO8601:timestamp}\s+%{IPORHOST:ip_address}\s+<%{NUMBER:priority}>%{NUMBER:process_id}: %{MONTH:month}\s+%{MONTHDAY:day}\s+%{TIME:time}: %{GREEDYDATA:interface_name}: %{GREEDYDATA:interface_state}

Once you have a Grok pattern that works – and check it against multiple examples of the log message, not just one, to make sure it works consistently – the next step is to convert your Grok pattern into a Graylog Pipeline Rule. Note that every backslash within your Grok string needs to be escaped with another backslash ("\" becomes "\\"). Here is the pipeline rule for parsing the message field using this Grok pattern:
rule "Parse Grok"
when
    true
then
    let MyGrok = grok(
        pattern: "%{TIMESTAMP_ISO8601:timestamp}\\s+%{IPORHOST:ip_address}\\s+<%{NUMBER:priority}>%{NUMBER:process_id}: %{MONTH:month}\\s+%{MONTHDAY:day}\\s+%{TIME:time}: %{GREEDYDATA:interface_name}: %{GREEDYDATA:interface_state}",
        value: to_string($message.message),
        only_named_captures: true
    );
    set_fields(
        fields: MyGrok
    );
end

5. Nothing is consistent? Time for Regex

If the field you need to extract from your data is really inconsistently placed, and none of these techniques are useful, then it’s probably time to write some Regex. Regex can be used in Pipeline Rules much the same as Grok, though it is better suited to scalpelling out a single tricky field than trying to parse a whole message into fields. There is a Graylog Documentation page on using Regex in Pipeline Rules. Regex is especially useful when capturing errors or stack traces, which can blow out to many lines of text and otherwise confuse your parsers. For example, the message:
26/03/2023 08:03:32.207 ERROR:  Error in EndVerifySealInBatch()Rep.dingo.Library.Serialisation.dingoHelperException: The exception has occured in one of the dingo Helper classes: ISL_LINK                
Server stack trace:
   at Rep.dingo.Library.Serialisation.DataFrame.VerifySeal(dingoSecurity2 itsSecure, Boolean dyeISRN, Byte[]& native, shipmentType shipmentType)
   at Rep.dingo.Library.MessageProcessor.Incoming.Class1Handler.AsyncVerifySeal(Boolean decryptIsrn, DataFrame df, Byte[]& dfNative)
   at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Int32 methodPtr, Boolean fExecuteInContext, Object[]& outArgs)
   at System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink)
Exception rethrown at [0]:
   at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase)
   at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData& msgData)
   at Rep.dingo.Library.MessageProcessor.Incoming.Class1Handler.AsyncVerifySealDelegate.EndInvoke(Byte[]& dfNative, IAsyncResult result)
   at Rep.dingo.Library.MessageProcessor.Incoming.Class1Handler.EndVerifySealInBatch()
If you want to capture the first 3 words after the first occurrence of “ERROR” in your log message, you could use a Regex rule. We would highly recommend the free online Regex tool available at https://regex101.com/ for the purposes of composing your Regex. In this example, the Regex rule would be:

[E][R][R][O][R].\s+(\S+\s\S+\s\S+)

This would capture the value “Error in EndVerifySealInBatch()Rep.dingo.Library.Serialisation.dingoHelperException:”. Once your Regex rule is working in https://regex101.com/, it is time to put it into a Graylog Pipeline Rule. Note that every backslash within your Regex string needs to be escaped with another backslash ("\" becomes "\\"). Here is the Pipeline Rule for capturing the first 3 words after the first occurrence of “ERROR” in the message field using this Regex rule:
rule "Regex field extract"
when
    true
then
    let MyRegex = regex("[E][R][R][O][R].\\s+(\\S+\\s\\S+\\s\\S+)", to_string($message.message));
    set_field("MyFieldname_1", MyRegex["0"]);
end
This rule would create a new field with the following value:

MyFieldname_1: “Error in EndVerifySealInBatch()Rep.dingo.Library.Serialisation.dingoHelperException:”

Very cool!

6. Stuck? Look for Extractors in the Graylog Marketplace

Extractors are a legacy feature of Graylog, providing an interface for extracting fields from messages hitting an input using Regex. We recommend against creating your parsing rules using the Extractors interface, as it is rather fiddly and outdated. You can read more about Extractors and how they work in the legacy Graylog Documentation.

Extractors have been around for many years, so there is merit to continuing to use this functionality: the Graylog Open community has created a lot of useful Extractor parsing rules over the years, and these are all available to download from the Graylog Marketplace. If you require a parser for the complex logs of a common hardware device or software suite, it can be worth checking whether the Graylog Open Community has already produced one. Work smarter not harder: downloading someone else’s ready-made parser is often quicker than writing your own 😎

Be mindful, however, that this option is presented late in this guide because it is something of a last resort. Extractors are a vestigial mechanism, and being community written and maintained, carry no guarantee of being correct, up to date, or even working. There will often be a bit of TLC required to get such content working and up to date.

7. Stuck? ChatGPT can write both Graylog Pipeline Rules and GROK/Regex Parsing… sometimes.

Technology is a beautiful thing! ChatGPT, the AI that needs no introduction, can write Graylog Pipeline rules. It can also write GROK or Regex parsers – just paste in your log sample and ask nicely. This is really useful in theory and can often point you in the right direction, but be warned that in practice the AI will make various mistakes. Rather than entering your requests into ChatGPT directly, we recommend checking out this useful Community tool, which leverages OpenAI’s GPT API and an extensive prompt designed to improve results: https://pipe-dreams.vercel.app/

AI is far from perfect at these tasks at this stage, but it is still very useful – particularly at showing syntax and structure. Please note the tabs on the top left of the tool that switch between Pipeline and GROK parsing modes.

8. I am still stuck – Parsing logs is hard!

Yes, parsing logs can be hard. If you really get stuck, and you still can’t parse your logs, there are several avenues for assistance you might pursue.
  • If your log message is from a common network hardware device or a software suite with a security focus, maybe we can write it for you! Graylog has a standing offer to create parsing rules for Enterprise Customers in these circumstances, for free and within 30 days. Simply provide the device model, the firmware version, and a sample log file (sanitize it first of course) containing at least 20 lines of log text to Graylog Support, and we will seek to include parsing rules for your device in a subsequent release of Illuminate.
  • Ask for help on the Graylog Community Forums. People do this for fun!
  • For Enterprise Customers, ask for help with a specific rule that you can’t get working from Graylog Support. Graylog Support cannot write your parsers for you, but they are more than happy to point out where you might be going wrong if you can provide them with the Pipeline Rule in question.
  • For Enterprise Customers, ask your Customer Success Manager about a Graylog Professional Services Engagement. Professional Services are not free, but it never hurts to have the option to call in the experts for a day to write your parsing rules, should you need it!
 

About Graylog 
At Graylog, our vision is a secure digital world where organizations of all sizes can effectively guard against cyber threats. We’re committed to turning this vision into reality by providing Threat Detection & Response that sets the standard for excellence. Our cloud-native architecture delivers SIEM, API Security, and Enterprise Log Management solutions that are not just efficient and effective—whether hosted by us, on-premises, or in your cloud—but also deliver a fantastic Analyst Experience at the lowest total cost of ownership. We aim to equip security analysts with the best tools for the job, empowering every organization to stand resilient in the ever-evolving cybersecurity landscape.

About Version 2 Digital

Version 2 Digital is one of the most dynamic IT companies in Asia. The company distributes a wide range of IT products across various areas including cyber security, cloud, data protection, end points, infrastructures, system monitoring, storage, networking, business productivity and communication products.

Through an extensive network of channels, point of sales, resellers, and partnership companies, Version 2 offers quality products and services which are highly acclaimed in the market. Its customers cover a wide spectrum which include Global 1000 enterprises, regional listed companies, different vertical industries, public utilities, Government, a vast number of successful SMEs, and consumers in various Asian cities.

Getting Ready with Regex 101

If you’ve dropped your house key in tall grass, you know how difficult it is to locate a small item hiding in an overgrown field. Perhaps you borrowed a metal detector from a friend, then returned to the field hoping to get the loud beep that indicates finding metal in an otherwise organic area.

 

Trying to find patterns in strings of data is the same process. However, instead of using a physical object, you use a regular expression (regex) to search for the key patterns that would find the data elements you want.

 

While regex is a well-known syntax across various programming languages, having an understanding of what it is and how to use it can help you be more efficient when trying to match patterns or manipulate strings.

 

What does regex mean?

Regex is short for regular expression, a specialized syntax for defining search patterns when matching and manipulating strings. Unlike simple wildcards, regex offers advanced capabilities that allow for flexible definitions to create narrow or broad searches across:

  • Data filters
  • Key events
  • Segments
  • Audiences
  • Content groups

 

A regular expression engine processes the regex patterns, performing the search, replacement, and validation. However, since regex is not limited to a single programming language, the regular expression engine for a specific language may have its own unique requirements.

 

The core components include:

  • Atoms: elements within the expressions
  • Metacharacters: definitions of grouping, quantification, and alternatives
  • Anchors: starting and ending points for a string or line
  • Character classes: specific characters defined within a search pattern
  • Quantifiers: how many times a character or character class must be matched
  • Alternation: a choice among possible search patterns to be matched (see the annotated sketch after this list)
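
As a brief, hedged illustration (the message format, rule name, and field names here are hypothetical), the pipeline rule below combines several of these components in a single pattern:
rule "annotated regex components"
when
    true
then
    // ^            anchor: the match must start at the beginning of the string
    // (error|warn) a group containing an alternation between two literal atoms
    // \s+          the \s character class with the + quantifier: one or more whitespace characters
    // (\d{3})      the \d character class with the {3} quantifier: exactly three digits, captured
    // $            anchor: the match must end at the end of the string
    let parts = regex("^(error|warn)\\s+(\\d{3})$", to_string($message.message));
    set_field("severity", parts["0"]);
    set_field("status_code", parts["1"]);
end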

 

What is a regex function used for?

Regex syntax is part of standard programming libraries so that programmers can define compact search patterns. Some typical uses include:

  • Pattern matching: identifying substrings within input strings that fit defined patterns
  • Search and replace: modifying strings by replacing the matched patterns with replacement strings
  • Validation: reviewing to ensure that input strings follow defined formats
  • Data extraction: retrieving data points from large bodies of text
  • Parsing: breaking strings into their components

 

Writing a Regular Expression

At its core, a regex pattern is a sequence of atoms, where each atom represents a single point that the regex engine attempts to match in a target string. These patterns can range from simple literal characters to complex formations involving grouping symbols, quantifiers, logical operators, and backreferences. Many tools are available to debug your regex patterns.

Simple Patterns

Simple regex patterns require a precise match for defined characters. For example, here are a few common text and data structures, regex patterns that match them, and example matches (a short usage sketch follows the table):

Regex Patterns Table

Pattern Name: Email Address
Regex: [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
Matches: user@example.com, test.email@domain.co, hello-world123@my-site.net

Pattern Name: U.S. Phone Number
Regex: \(\d{3}\) \d{3}-\d{4}
Matches: (123) 456-7890, (987) 654-3210

Pattern Name: IPv4 Address
Regex: \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
Matches: 192.168.1.1, 255.255.255.0
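
As one hedged sketch of how a pattern like these could be used in practice (the rule name and field name are ours, not standard content), the pipeline rule below pulls the first IPv4-looking address out of a message:
rule "extract first ipv4 address"
when
    true
then
    // (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) captures the first dotted-quad sequence in the message
    let ip = regex("(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})", to_string($message.message));
    set_field("src_ip_candidate", ip["0"]);
end
Note that, like the table entry it borrows from, this pattern matches any dotted quad of one- to three-digit numbers, so it will also match values that are not valid IP addresses.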

 

Escaping

Escaping in regex uses a backslash (\) to treat special characters as literals, ensuring they are interpreted correctly by the regex engine. Escaping is only necessary when a character has a special meaning in regex, such as . or *. For a simple string like “ily”, no escaping is required since it contains no special characters.

Special characters

Special characters in regex provide additional matching capabilities beyond literal sequences.

For example, if you want to match a Windows file path that starts with C:\, you need to properly escape the backslash (\) since it is a special character in regex. The correct regex pattern would be C:\\ to match C:\ exactly. If you want to match a full file path like C:\Users\jdarr\Documents, the regex would be C:\\Users\\jdarr\\Documents. Similarly, if you want to match a file extension (e.g., .txt), you must escape the period as \.txt, since . is a wildcard in regex.

Parentheses

Parentheses in regex are primarily used to create capturing groups, which allow specific parts of a match to be referenced later. This is particularly useful for backreferences and substitutions. For example, in the regex pattern (\d{3})-\1, the first (\d{3}) captures a three-digit number, and \1 ensures that the same number appears again, matching values like 123-123 but not 123-456. If you need to group elements without capturing them, you can use non-capturing groups with (?:…), which helps structure complex patterns without affecting backreferences.
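
As a hedged sketch of the capturing-group-plus-backreference idea (the rule and field names are hypothetical), this pipeline rule flags messages that contain a repeated three-digit number such as 123-123:
rule "flag repeated three digit number"
when
    // (\d{3})-\1 matches a three-digit group followed by a hyphen and the same three digits again
    regex("(\\d{3})-\\1", to_string($message.message)).matches == true
then
    set_field("repeated_number", true);
end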

Matching characters

Most characters in regex match themselves, meaning that a pattern like test searches for that exact sequence of characters within strings. Combining literal characters and metacharacters allows you to create more complex patterns for matching, like defining case sensitivity for letters.

Repeating Things

Using metacharacters, you can create more complex searches that also allow for repetition within a sequence. The * metacharacter signifies that the preceding character can match zero or more times, while + requires one or more matches.
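
A small, hedged sketch of the difference (rule and field names are hypothetical): “ab*c” matches “ac”, “abc”, and “abbc”, while “ab+c” requires at least one “b” and so does not match “ac”.
rule "quantifier demonstration"
when
    true
then
    // * allows zero or more of the preceding character; + requires one or more
    set_field("matches_zero_or_more_b", regex("ab*c", to_string($message.message)).matches);
    set_field("matches_one_or_more_b", regex("ab+c", to_string($message.message)).matches);
end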

Using Regular Expressions

Regex engines expand upon these fundamentals so that you can more easily manipulate text and search within your programming and data processing.

Programming Languages

While you can use regex with any programming language, you should be aware that each language has its own idiosyncrasies. For example, if you have a working regex in Python and then try to convert it to Java, you can run into issues arising from the different implementations.

The Backslash Plague

Because the backslash is an escape character both in regex and in most string literals, expressions can accumulate long runs of backslashes that make them harder to read. For example, a backslash that must reach the regex engine as \\ typically has to be written with double escaping (\\\\) in source code. As your expressions get longer, you can lose track of the number of backslashes necessary, which can break the match.
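
A minimal, hedged sketch of how this plays out in a Graylog pipeline rule (rule and field names are hypothetical): the regex \d+ must be written as "\\d+" inside the rule’s string literal, and matching a single literal backslash (regex \\) becomes "\\\\":
rule "backslash escaping example"
when
    true
then
    // "(\\d+)" reaches the regex engine as (\d+): one or more digits, captured
    let digits = regex("(\\d+)", to_string($message.message));
    set_field("first_number", digits["0"]);
    // "(\\\\)" reaches the regex engine as (\\): one literal backslash, captured
    let slash = regex("(\\\\)", to_string($message.message));
    set_field("first_backslash", slash["0"]);
end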

Match and replace

Regex patterns are integral to searching for specific sequences within input strings. Match-and-replace functions, which are versatile thanks to optional parameters, add the ability to perform fine-tuned searching, validation, and replacement.

 

For example, if you want to match a pattern to replace sensitive information in a log, you might use a function like:
regex_replace(pattern: string, value: string, replacement: string, [replace_all: boolean])

 

To extract a username from a login message, or to rewrite the message around it, you might write this:
// message = 'logged in user: mike'
let username = regex_replace(".*user: (.*)", to_string($message.message), "$1");
// message = 'logged in user: mike'
let string = regex_replace("logged (in|out) user: (.*)", to_string($message.message), "User $2 is now logged $1");

 

Graylog Enterprise: Getting the Most from Your Logs

With Graylog Enterprise, you get built-in content that allows you to rapidly parse, normalize, and analyze your log data, optimizing your data’s value without requiring specialized skills. Graylog Enterprise is built to help transform your IT infrastructure into an optimized, secure, and compliant powerhouse.

 

With Graylog, you can build and configure pipeline rules using structured “when, then” statements to define conditions and actions. Using functions, pre-defined methods for performing specific actions on log messages during processing, you can define parameters that return a value. Within the list of Graylog functions, you can use regex functions for pattern matching with Java syntax, reducing the learning curve as you build your rules.

 

To learn how Graylog can improve your operations and security, contact us today for a demo.

About Graylog 
At Graylog, our vision is a secure digital world where organizations of all sizes can effectively guard against cyber threats. We’re committed to turning this vision into reality by providing Threat Detection & Response that sets the standard for excellence. Our cloud-native architecture delivers SIEM, API Security, and Enterprise Log Management solutions that are not just efficient and effective—whether hosted by us, on-premises, or in your cloud—but also deliver a fantastic Analyst Experience at the lowest total cost of ownership. We aim to equip security analysts with the best tools for the job, empowering every organization to stand resilient in the ever-evolving cybersecurity landscape.

About Version 2 Digital

Version 2 Digital is one of the most dynamic IT companies in Asia. The company distributes a wide range of IT products across various areas including cyber security, cloud, data protection, end points, infrastructures, system monitoring, storage, networking, business productivity and communication products.

Through an extensive network of channels, point of sales, resellers, and partnership companies, Version 2 offers quality products and services which are highly acclaimed in the market. Its customers cover a wide spectrum which include Global 1000 enterprises, regional listed companies, different vertical industries, public utilities, Government, a vast number of successful SMEs, and consumers in various Asian cities.