Overview
This blog post is about exporting Defender for Cloud vulnerabilities and running them through OpenAI to enrich the data so we can get more out of it. In the Defender for Cloud portal you can click on a recommendation and it will give you a description and how to remediate it. By running the data through the OpenAI API we can get better descriptions and remediation steps, and even the potential impact of someone exploiting the issue. I will put a link to the GitHub repo at the bottom of this post.
I have been learning Golang (Go) because I want to learn a language that compiles into a binary and runs a little faster than Python. Go seemed like a good option since a lot of cloud services are built with it. Because of that, this program is written in Go, and it is honestly pretty straightforward. I am not all that good at programming, so none of the scripts/programs I make are very complex; most of them just grab API data, sort it, and do something with it. That is exactly what this program does.
Defender Assessment API
There are two main APIs that this program uses to grab data. The first is the Defender Assessments – List API, which gets all security assessments of the scanned resources in your scope.
https://learn.microsoft.com/en-us/rest/api/defenderforcloud-composite/assessments/list?view=rest-defenderforcloud-composite-stable&tabs=HTTP
Here is an example of a response from it that I got from the link above:
{
  "value": [
    {
      "id": "/subscriptions/20ff7fc3-e762-44dd-bd96-b71116dcdc23/resourceGroups/myRg/providers/Microsoft.Compute/virtualMachineScaleSets/vmss1/providers/Microsoft.Security/assessments/21300918-b2e3-0346-785f-c77ff57d243b",
      "name": "21300918-b2e3-0346-785f-c77ff57d243b",
      "type": "Microsoft.Security/assessments",
      "properties": {
        "resourceDetails": {
          "source": "Azure",
          "id": "/subscriptions/20ff7fc3-e762-44dd-bd96-b71116dcdc23/resourceGroups/myRg/providers/Microsoft.Compute/virtualMachineScaleSets/vmss1"
        },
        "displayName": "Install endpoint protection solution on virtual machine scale sets",
        "status": {
          "code": "Healthy",
          "statusChangeDate": "2021-04-12T09:07:18.6759138Z",
          "firstEvaluationDate": "2021-04-12T09:07:18.6759138Z"
        }
      }
    }
  ]
}
In this response we are interested in a couple of things. The first is the name, which is the ID of the assessment and will be used later. The next is the display name, which gives a quick description of what the assessment is. The last is the status of the assessment. NotApplicable and Healthy mean we don't need it, but if it is Unhealthy, that means a resource in our environment failed the assessment and is potentially vulnerable.
It is important to note that a lot of these assessments don't take the context of the environment into account; they are only pass/fail. So even though some of them are unhealthy and failed, there might be a reason for that: maybe a mitigating control is in place somewhere else, or maybe the resource just has to be configured a certain way for your environment.
Now that we understand what these assessments are, you might have noticed that there is no full description of the assessment, no remediation steps, and no severity. That data is all in a separate API.
Assessment Metadata API
The other API is what we need to grab the remaining pieces of data on each assessment. It is called the Assessments Metadata – List API, and it grabs the metadata for all assessment types.
https://learn.microsoft.com/en-us/rest/api/defenderforcloud-composite/assessments-metadata/list?view=rest-defenderforcloud-composite-stable&tabs=HTTP
The responses from this one are a little bigger. Here is an example from the API site:
{
  "value": [
    {
      "id": "/providers/Microsoft.Security/assessmentMetadata/21300918-b2e3-0346-785f-c77ff57d243b",
      "name": "21300918-b2e3-0346-785f-c77ff57d243b",
      "type": "Microsoft.Security/assessmentMetadata",
      "properties": {
        "displayName": "Install endpoint protection solution on virtual machine scale sets",
        "policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/26a828e1-e88f-464e-bbb3-c134a282b9de",
        "description": "Install an endpoint protection solution on your virtual machines scale sets, to protect them from threats and vulnerabilities.",
        "remediationDescription": "To install an endpoint protection solution: 1. <a href=\"https://docs.microsoft.com/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-faq#how-do-i-turn-on-antimalware-in-my-virtual-machine-scale-set\">Follow the instructions in How do I turn on antimalware in my virtual machine scale set</a>",
        "categories": [
          "Compute"
        ],
        "severity": "Medium",
        "userImpact": "Low",
        "implementationEffort": "Low",
        "threats": [
          "dataExfiltration",
          "dataSpillage",
          "maliciousInsider"
        ],
        "publishDates": {
          "GA": "06/01/2021",
          "public": "06/01/2021"
        },
        "plannedDeprecationDate": "03/2022",
        "tactics": [
          "Credential Access",
          "Persistence",
          "Execution",
          "Defense Evasion",
          "Collection",
          "Discovery",
          "Privilege Escalation"
        ],
        "techniques": [
          "Obfuscated Files or Information",
          "Ingress Tool Transfer",
          "Phishing",
          "User Execution"
        ],
        "assessmentType": "BuiltIn"
      }
    }
  ]
}
Same as the last API, there are a few data items in the response that we are looking for. The first again is the name ID. This ID is really important because the name in this response should match an ID in the other API's response. Besides that, we also want to grab the description, remediation, and severity of the assessment. By combining these two pieces of data we should be able to put together a full vulnerability.
Structs for Handling Vulnerability Data in JSON Responses
The first thing we need is a set of structs to hold our data.
Based on those APIs we can make a pretty good guess at what we need. First, we need response structs that will hold the entire response from each API call. When dealing with API responses, structuring your data correctly is critical for parsing and using it effectively. In my program, I've defined two struct types, AssessmentResponse and MetadataResponse, to handle the JSON responses from the APIs:
type AssessmentResponse struct {
    Vulnerabilities []DefenderAssessment `json:"value"`
}

type MetadataResponse struct {
    Vulnerabilities []AssessmentMetadata `json:"value"`
}
Both of these structs share a Vulnerabilities field, which is a slice of their respective type. The field is tagged with json:"value", which corresponds to the key in the JSON response where the data lives. When Go unmarshals the JSON, it maps the value array from the response onto the Vulnerabilities field and populates the slice with the respective data.
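As a quick illustration, here is a minimal sketch of that mapping in action. This exact snippet isn't part of the program, and body is just a placeholder for the raw JSON bytes of a response:
// Minimal sketch: json.Unmarshal maps the "value" key onto the Vulnerabilities slice.
// "body" is assumed to hold the raw JSON bytes of an Assessments - List response.
var assessmentResponse AssessmentResponse
if err := json.Unmarshal(body, &assessmentResponse); err != nil {
    fmt.Println("Error unmarshalling assessments:", err)
    return
}
fmt.Println("Assessments returned:", len(assessmentResponse.Vulnerabilities))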
The types for those vulnerabilities follow the same idea. As you can see below, we use the same kind of tags in these structs. The only difference is that some of the data fields in these responses are nested, so our structs need to mirror that structure. For example, the properties key contains data keys that we want, so we nest a Properties struct and line the fields below it up with the JSON response data:
type DefenderAssessment struct {
    NameID     string `json:"name"`
    Properties struct {
        ResourceDetails struct {
            ResourceName string `json:"ResourceName"`
            ResourceID   string `json:"NativeResourceId"`
        }
        DisplayName string `json:"displayName"`
        Status      struct {
            Code string `json:"code"`
        }
    }
}

type AssessmentMetadata struct {
    NameID     string `json:"name"`
    Properties struct {
        DisplayName string `json:"displayName"`
        Description string `json:"description"`
        Remediation string `json:"remediationDescription"`
        Severity    string `json:"severity"`
    }
}
Now that we have our types made, we can go and make a call to these APIs to use those types to hold our data.
Calling the APIs
First we need a couple of helper funcs: one to get an Azure AD token and one to make requests to the APIs.
func getCred(scope string) (*azcore.AccessToken, error) {
    cred, err := azidentity.NewDefaultAzureCredential(nil)
    if err != nil {
        // Log the error and return it
        fmt.Println("Error creating credential in getCred:", err)
        return nil, err
    }
    // Use the default az credential to get a token.
    aadToken, err := cred.GetToken(context.Background(), policy.TokenRequestOptions{Scopes: []string{scope}})
    if err != nil {
        fmt.Println("Error grabbing token:", err)
        return nil, err
    }
    return &aadToken, nil
}
This func will return the token object. Now we can use it in the main func:
aadToken, err := getCred("https://management.azure.com/.default")
if err != nil {
    fmt.Println("Error grabbing management AAD token:", err)
    return
}
Now we need to get the scope of the API call, so we need a function that asks the user for the scope they want. The API accepts subscriptions or management groups as the scope.
func getScope() (string, bool) {
    var scope string
    var subscriptionid string
    var managementgroup string
    var scopeString string
    reader := bufio.NewReader(os.Stdin)
    for {
        fmt.Print("What scope do you want the vulnerability data to be on? Type 'subscription' or 'managementgroup': ")
        scope, _ = reader.ReadString('\n') // Read input until newline
        scope = strings.TrimSpace(scope)   // Trim newline and any spaces
        if strings.EqualFold(scope, "subscription") {
            fmt.Print("What is your subscription ID: ")
            subscriptionid, _ = reader.ReadString('\n') // Read the subscription ID
            subscriptionid = strings.TrimSpace(subscriptionid)
            if subscriptionid != "" {
                scopeString = fmt.Sprintf("subscriptions/%v", subscriptionid)
                return scopeString, true
            }
            fmt.Println("Subscription ID cannot be empty. Please try again.")
        } else if strings.EqualFold(scope, "managementgroup") {
            fmt.Print("What is your management group name: ")
            managementgroup, _ = reader.ReadString('\n') // Read the management group name
            managementgroup = strings.TrimSpace(managementgroup)
            if managementgroup != "" {
                scopeString = fmt.Sprintf("providers/Microsoft.Management/managementGroups/%v", managementgroup)
                return scopeString, false
            }
            fmt.Println("Management group name cannot be empty. Please try again.")
        } else {
            fmt.Println("Invalid input. Please choose 'subscription' or 'managementgroup'.")
        }
    }
}
This will return a scope string and a true or false: if true, the scope is a subscription; if false, it is a management group.
scopeString, sub := getScope()
var assessmentApi string
if sub {
    // For subscription scope
    assessmentApi = fmt.Sprintf("https://management.azure.com/%v/providers/Microsoft.Security/assessments?api-version=2021-06-01", scopeString)
} else {
    // For management group scope
    assessmentApi = fmt.Sprintf("https://management.azure.com/%v/providers/Microsoft.Security/assessments?api-version=2021-06-01", scopeString)
}
***Note: I could not get the management group scope to work on the API. I'm not sure if I have the format wrong or what, but the call does not want to work when I put my management group name into it. I also tried the management group ID and it didn't work. So I might just be an idiot.
Now we need a function that makes the request and unmarshals the data into our types. It also needs to handle pagination, because sometimes all of the data cannot fit in one API response and the response points to the next page. (This took me forever to figure out; I was missing results and could not figure out why. Following the nextLink to the next page fixed it.)
func makeRequest[T any](apiUrl string, token *azcore.AccessToken) ([]T, error) {
    var allResults []T
    var url = apiUrl
    for url != "" {
        // Create a new GET request
        req, err := http.NewRequest("GET", url, nil)
        if err != nil {
            return nil, fmt.Errorf("error creating request: %w", err)
        }
        // Add the token in the request header
        req.Header.Add("Authorization", "Bearer "+token.Token)
        // Make the request
        client := &http.Client{}
        res, err := client.Do(req)
        if err != nil {
            return nil, fmt.Errorf("error making request: %w", err)
        }
        // Read the response body and close it right away. (A defer here would
        // keep every page's body open until the function returns.)
        body, err := io.ReadAll(res.Body)
        res.Body.Close()
        if err != nil {
            return nil, fmt.Errorf("error reading response body: %w", err)
        }
        // Handle the response generically
        var response struct {
            Value    []T    `json:"value"`
            NextLink string `json:"nextLink"`
        }
        // Unmarshal the response into the generic structure
        err = json.Unmarshal(body, &response)
        if err != nil {
            return nil, fmt.Errorf("error unmarshalling response: %w", err)
        }
        // Append the results to the final slice
        allResults = append(allResults, response.Value...)
        // Update the URL for the next page
        url = response.NextLink
    }
    return allResults, nil
}
Now a couple more helper funcs to make the requests:
func fetchAssessments(apiUrl string, token *azcore.AccessToken) ([]DefenderAssessment, error) {
    // Fetch all assessments
    assessments, err := makeRequest[DefenderAssessment](apiUrl, token)
    if err != nil {
        return nil, fmt.Errorf("failed to fetch assessments: %w", err)
    }
    return assessments, nil
}

func fetchMetadata(apiUrl string, token *azcore.AccessToken) ([]AssessmentMetadata, error) {
    // Fetch all metadata
    metadata, err := makeRequest[AssessmentMetadata](apiUrl, token)
    if err != nil {
        return nil, fmt.Errorf("failed to fetch metadata: %w", err)
    }
    return metadata, nil
}
And now call them in the main func:
assessments, err := fetchAssessments(assessmentApi, aadToken)
if err != nil {
    fmt.Println("Error grabbing vulnerabilities:", err)
}
assessmentMetadata, err := fetchMetadata(metadataApi, aadToken)
if err != nil {
    fmt.Println("Error grabbing metadata:", err)
}
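One thing the snippets above don't show is where metadataApi comes from. Unlike the assessments call, the Assessments Metadata – List API is not scoped to a subscription or management group, so the URL is built once in main, something like this (the exact api-version the program uses is in the GitHub repo):
// Assessments Metadata - List endpoint; it is tenant-wide, so there is no scope in the path.
// The api-version shown here is an assumption; check the repo for the exact value.
metadataApi := "https://management.azure.com/providers/Microsoft.Security/assessmentMetadata?api-version=2021-06-01"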
Creating Full Vulnerability
Now we need to combine the two pieces of data into one full recommendation/vulnerability. Let's create a func to do this.
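One note before the func: the Recommendation type itself never shows up in the snippets above. Based on how it is used, it looks roughly like this sketch (the exact definition is in the GitHub repo):
// Rough sketch of the Recommendation type, inferred from how it is used below.
type Recommendation struct {
    NameID            string
    DisplayName       string
    Description       string
    Severity          string
    AffectedResources []string
    Remediation       string
    Context           string
}
With that in mind, here is the func: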
func createRecommendation(assessments []DefenderAssessment, metadatas []AssessmentMetadata) []Recommendation {
    var filteredAssessments []DefenderAssessment
    existingRecommendations := make(map[string]*Recommendation)
    // Filter the slice so it is just the unhealthy assessments
    for _, assessment := range assessments {
        // Check if the assessment is "Unhealthy"; if so, keep it
        if assessment.Properties.Status.Code == "Unhealthy" {
            filteredAssessments = append(filteredAssessments, assessment)
        }
    }
    // Loop through each filtered assessment
    for _, filteredAssessment := range filteredAssessments {
        // Check if recommendation for this NameID already exists
        if recommendation, found := existingRecommendations[filteredAssessment.NameID]; found {
            // Append the resource name to the existing recommendation
            recommendation.AffectedResources = append(recommendation.AffectedResources, filteredAssessment.Properties.ResourceDetails.ResourceName)
        } else {
            // Create a new recommendation since it doesn't exist
            newRecommendation := Recommendation{
                NameID:            filteredAssessment.NameID,
                DisplayName:       filteredAssessment.Properties.DisplayName, // Use DisplayName from filteredAssessment
                Description:       "Unknown",
                Severity:          "Unknown", // You can set default severity or leave it empty for now
                AffectedResources: []string{filteredAssessment.Properties.ResourceDetails.ResourceName},
                Remediation:       "Unknown", // You can leave remediation empty for now
                Context:           "Unknown",
            }
            // Add the new recommendation to the map
            existingRecommendations[filteredAssessment.NameID] = &newRecommendation
        }
    }
    for _, metadata := range metadatas {
        if recommendationid, found := existingRecommendations[metadata.NameID]; found {
            recommendationid.Description = metadata.Properties.Description
            recommendationid.Severity = metadata.Properties.Severity
            recommendationid.Remediation = metadata.Properties.Remediation
        }
    }
    var recommendations []Recommendation
    for _, recommendation := range existingRecommendations {
        recommendations = append(recommendations, *recommendation)
    }
    return recommendations
}
Basically, this func filters out the not applicable and healthy assessments and only keeps the unhealthy ones. After it does that, it loops through them and checks whether each name ID is already in a map. Remember, the name IDs line up between the assessments and the metadata, so we only want one recommendation per name ID. If we didn't do this, there would be a separate entry for every single resource, and I think it is better to have one vulnerability with all of the affected resources inside it.
After checking that the name ID doesn't already exist, it creates a new recommendation with the data fields it has and sets the remaining fields to "Unknown" for now. It then adds the recommendation to the map.
Next it loops through the slice of metadata and updates those unknown fields with the data we got from the metadata API.
***Important to note: some of the vulnerabilities did not exist in the metadata API. I'm not sure if Microsoft just doesn't have them available yet or what else is going on, but some did not have that extra data. In those cases we leave the fields unknown and rely on the AI to fill in the extra data.
At the end we turn the map into a slice and return it. Let's use it in main now.
recommendations := createRecommendation(assessments, assessmentMetadata)
Using AI to Enrich Data
The next piece is where it gets kind of confusing but really interesting. As mentioned earlier, I am not an experienced programmer, and honestly I will never be an expert dev since that is not my field. Because of this I was not sure how to achieve concurrency and parallelism in programs. These are what allow programs to run multiple tasks at once: concurrency can launch different tasks and switch between them, while parallelism can actually run multiple tasks at the exact same time.
This becomes really important for the efficiency of this code because we need to make multiple requests to the OpenAI API for our vulnerabilities. I decided not to send them in a batch and instead send them one at a time, so without concurrency it would take much longer, because every single API call would need a response before we could move on to the next one. If you have ever used ChatGPT or something similar, you know that responses can take a little bit.
Go, however, has goroutines, which handle concurrency and parallelism for you. This lets us send multiple requests to the API at the same time, which in turn greatly reduces the amount of time the program takes to run and complete.
Let's take a look now at the function that calls the AI to enrich the data:
func aiEnrich(apiKey string, recommendation *Recommendation, wg *sync.WaitGroup, ch chan<- *Recommendation) {
    defer wg.Done()
    client := openai.NewClient(
        option.WithAPIKey(apiKey))
    ctx := context.Background()
    vulnerabilityName := recommendation.DisplayName
    description := recommendation.Description
    remediation := recommendation.Remediation
    prompt := `
The following is an Azure vulnerability report. You are an Azure cloud security engineer and need to provide additional info to your team, so try and enrich it with additional details:
The sections should be named **Explanation of the Vulnerability:**, **Remediation Steps:**, **Context about the Impact of the Vulnerability:**
- Provide an expanded explanation of the vulnerability.
- Suggest remediation steps, but keep it kinda short because it needs to be done on a lot of vulnerabilities.
- Provide context about the impact of the vulnerability.
- If you are given "unknown", create the three keys regardless with your own information.
Vulnerability Name: ` + vulnerabilityName + `
Description: ` + description + `
Remediation: ` + remediation + `
Response:
`
    //print("> ")
    //println(prompt)
    //println()
    completion, err := client.Chat.Completions.New(ctx, openai.ChatCompletionNewParams{
        Messages: openai.F([]openai.ChatCompletionMessageParamUnion{
            openai.UserMessage(prompt),
        }),
        Seed:  openai.Int(1),
        Model: openai.F(openai.ChatModelGPT4o),
    })
    if err != nil {
        // Note: a panic inside a goroutine will take down the whole program
        panic(err)
    }
    response := completion.Choices[0].Message.Content
    // Parse the response into variables
    explanation, remediation, context := parseAIResponse(response)
    recommendation.Description = explanation
    recommendation.Remediation = remediation
    recommendation.Context = context
    ch <- recommendation
}
This func needs the OpenAI API key, a pointer to a recommendation, the wait group, and the channel that the goroutine will use to send its result back.
The wait group is used for synchronization when running multiple goroutines. It prevents the program from exiting before all goroutines have completed their tasks. The channel allows the function to send the enriched Recommendation back to the main program or another function for further processing.
We create the OpenAI client and build a prompt for each vulnerability we run through it. In the prompt we tell the model how to respond to the data it is given and what format the response should be in. After that we send the prompt with the vulnerability data to the AI and get the response back.
There is another func in here called parseAIResponse(). It is just formatting, so I won't cover it in detail. It basically takes the response from the AI and splits it into three variables: explanation, remediation, and context. Those then replace the fields in the Recommendation type, which, if you remember, were set to unknown, so now they will not be unknown.
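If you are curious, here is a rough sketch of what a parseAIResponse() could look like, assuming the model follows the three section headers from the prompt. The real implementation is in the GitHub repo:
// Sketch of parseAIResponse: slice the model's response on the three section
// headers named in the prompt. Assumes the "strings" package is imported.
func parseAIResponse(response string) (explanation, remediation, context string) {
    const (
        explHeader = "**Explanation of the Vulnerability:**"
        remHeader  = "**Remediation Steps:**"
        ctxHeader  = "**Context about the Impact of the Vulnerability:**"
    )
    // Helper that returns the text between a start header and the next header (or the end).
    section := func(text, start, end string) string {
        i := strings.Index(text, start)
        if i == -1 {
            return "Unknown"
        }
        text = text[i+len(start):]
        if end != "" {
            if j := strings.Index(text, end); j != -1 {
                text = text[:j]
            }
        }
        return strings.TrimSpace(text)
    }
    explanation = section(response, explHeader, remHeader)
    remediation = section(response, remHeader, ctxHeader)
    context = section(response, ctxHeader, "")
    return explanation, remediation, context
}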
Let's use this in main now. We need to create the wait group and the channel for this func:
var wg sync.WaitGroup
ch := make(chan *Recommendation)
fmt.Println("Enriching the vulnerability data with AI.....")
for i := range recommendations {
    wg.Add(1)
    go aiEnrich(apiKey, &recommendations[i], &wg, ch)
}
go func() {
    wg.Wait()
    close(ch)
}()
var enrichedRecommendations []Recommendation
for enrichedRecommendation := range ch {
    enrichedRecommendations = append(enrichedRecommendations, *enrichedRecommendation)
}
We create the wait group and the channel that will hold the Recommendations as the goroutines run. Here is a little explanation of how the goroutines work with the wait group:
How wg.Add(1) Works
- wg.Add(1) increments the counter in the WaitGroup by 1. It indicates that a new goroutine is starting.
- You call wg.Add(1) once for each goroutine you want to track. For example, if you loop through a list of 10 items and call wg.Add(1) in each iteration, the counter will increment by 10 (one for each goroutine).
How wg.Done() Works
- wg.Done() decrements the counter in the WaitGroup by 1. It signals that a goroutine has finished its work.
- Every goroutine must call wg.Done() when it's done, or the counter will never reach 0, causing wg.Wait() to block indefinitely.
How wg.Wait() Works
- wg.Wait() blocks execution until the counter reaches 0. It ensures that the main program waits for all goroutines to finish before moving forward.
Why close(ch) Happens After wg.Wait()
The close(ch) call happens after wg.Wait() because:
- wg.Wait() ensures all goroutines have finished processing (i.e., the counter is back to 0).
- Once all the goroutines are done and no more data will be sent to the channel, the channel can be safely closed.
Something really interesting about goroutines is that the Go runtime will schedule as many of them as the machine can handle. Since these goroutines spend most of their time waiting on the OpenAI API, even a modest number of vulnerabilities, say 10 or 15, can effectively all be in flight at once. By adding the goroutines to this program we speed it up and save a bunch of time. At the end I will show the program with and without the goroutines and compare the execution times.
After the goroutines are all done, wg.Wait() returns, the channel is closed, and we can then pull the data out of the channel into a slice of updated recommendations.
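One thing the code above does not do is limit how many goroutines call the API at the same time. If you ever want to cap that (for example, to stay under API rate limits), a common Go pattern is a buffered channel used as a semaphore. This is just a sketch of a hypothetical tweak, not something my program does:
// Hypothetical tweak: allow at most 5 OpenAI calls in flight at once.
sem := make(chan struct{}, 5)
for i := range recommendations {
    wg.Add(1)
    go func(rec *Recommendation) {
        sem <- struct{}{}        // take a slot before calling the API
        defer func() { <-sem }() // give the slot back when done
        aiEnrich(apiKey, rec, &wg, ch)
    }(&recommendations[i])
}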
Exporting to Excel
The last piece, now that we have all of the data, is exporting it to an Excel document. We import github.com/xuri/excelize/v2 as excel and use it in our exportToExcel() func.
func exportToExcel(recommendations []Recommendation, fileName string) error {
    f := excel.NewFile()
    sheetName := "Recommendations"
    f.SetSheetName(f.GetSheetName(0), sheetName)
    headers := []string{"DisplayName", "Description", "Severity", "AffectedResources", "Remediation", "Context"}
    // Write headers
    for i, header := range headers {
        col := fmt.Sprintf("%c", 'A'+i) // Convert index to a column letter
        cell := fmt.Sprintf("%s1", col)
        if err := f.SetCellValue(sheetName, cell, header); err != nil {
            return fmt.Errorf("failed to set header cell %s: %w", cell, err)
        }
    }
    // Write data rows
    for rowIndex, rec := range recommendations {
        row := rowIndex + 2 // Start from row 2 (after headers)
        if err := f.SetCellValue(sheetName, fmt.Sprintf("A%d", row), rec.DisplayName); err != nil {
            return err
        }
        if err := f.SetCellValue(sheetName, fmt.Sprintf("B%d", row), rec.Description); err != nil {
            return err
        }
        if err := f.SetCellValue(sheetName, fmt.Sprintf("C%d", row), rec.Severity); err != nil {
            return err
        }
        if err := f.SetCellValue(sheetName, fmt.Sprintf("D%d", row), strings.Join(rec.AffectedResources, ", ")); err != nil {
            return err
        }
        if err := f.SetCellValue(sheetName, fmt.Sprintf("E%d", row), rec.Remediation); err != nil {
            return err
        }
        if err := f.SetCellValue(sheetName, fmt.Sprintf("F%d", row), rec.Context); err != nil {
            return err
        }
    }
    // Save the file
    if err := f.SaveAs(fileName); err != nil {
        return fmt.Errorf("failed to save Excel file: %w", err)
    }
    fmt.Printf("File successfully saved as: %s\n", fileName)
    return nil
}
Basically this func creates a new file, writes the headers in the first row, loops through the slice of recommendations it takes as a parameter, and writes each vulnerability's data under the correct column header. The headers line up with the fields in the Recommendation struct. At the end we save the file, and now our program should be complete.
Testing Program Out
Now let's run it and see if everything works as it should. I am going to blur out the key and subscription ID so I don't leak anything.
Ok, so now, as mentioned above, I am going to show you the run time when the goroutines are removed and every vulnerability is run through OpenAI one at a time. The time difference is actually crazy.
The screenshot looks different because I took it before completing the program, when I was still testing things. As you can see though, it took nearly a full 4 extra minutes to complete! The goroutines cut almost 4 minutes off the execution time, and I didn't change anything else; it was the same subscription with the same number of vulnerabilities.
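If you want to measure this yourself, timing can be done with the standard time package by wrapping the enrichment step, something like this (a hypothetical snippet, not part of the program):
// Hypothetical timing around the enrichment step using the time package.
start := time.Now()
// ... launch the aiEnrich goroutines and drain the channel here ...
fmt.Printf("Enrichment took %s\n", time.Since(start))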
Now we can take a look at the Excel sheet to see some of the recommendations it pops out.
Here is a screenshot of a couple of the items from the excel document:
One thing to note is that I left the unknown in the severity column for one of them. This means item #27 did not have any metadata for the vulnerability, so in this example the description is entirely generated by OpenAI. All of the others were fed a description from the metadata and the AI improved it, but this one was created from scratch, which is really cool.
Now here are a couple of the remediation and context columns for the vulnerabilities.
The context column shows the potential impact of not remediating the vulnerability, and the remediation column gives steps for remediating it.
Conclusion
This is the end of the blog. This was a great project to help me improve my coding skills and learn Go a little better, and it could also be helpful for some people to use. A lot of companies use Defender for Cloud as a CSPM, and this program can pull all of the vulnerabilities out of it and into an Excel document that can be easily shared. Also, because of the AI integration, you can get more helpful data than you might get by just looking in the portal or exporting a document from it.
So, if you made it this far, congrats. Hopefully you enjoyed reading through this and maybe even learned something. The link to the full source code is: https://github.com/bauerbrett/AzureDefenderVulnerabilities